PRINCIPAL COMPONENT ANALYSIS FOR SEMIMARTINGALES AND STOCHASTIC PDE

ALBERTO OHASHI AND ALEXANDRE B. SIMAS 


Abstract. In this work, we develop a novel principal component analysis (PCA) for semimartingales 
by introducing a suitable spectral analysis for the quadratic variation operator. Motivated by
high-dimensional complex systems typically found in interest rate markets, we investigate correlation in
high-dimensional high-frequency data generated by continuous semimartingales. In contrast to the 
traditional PCA methodology, the directions of large variations are not deterministic, but rather 
they are bounded variation adapted processes which maximize quadratic variation almost surely. 
This allows us to reduce dimensionality from high-dimensional semimartingale systems in terms of 
quadratic covariation rather than the usual covariance concept. 

The proposed methodology allows us to investigate space-time data driven by multi-dimensional 
latent semimartingale state processes. The theory is applied to discretely-observed stochastic PDEs 
which admit finite-dimensional realizations. In particular, we provide consistent estimators for
finite-dimensional invariant manifolds for Heath-Jarrow-Morton models. More importantly, components
of the invariant manifold associated to volatility and drift dynamics are consistently estimated and 
identified. The proposed methodology is illustrated with both simulated and real data sets. 
1. Introduction

Dimension reduction techniques have been intensively studied in recent years due to the advent
of high-dimensional data in a variety of applied fields. For an effective dimension reduction, it
is crucial to interpret correctly what kind of lower-dimensional manifold one has to find in order to
represent the data properly. For instance, if the second moment structure reasonably describes the
dynamics in the data, then the classical Principal Component Analysis (henceforth abbreviated as
PCA) and its various extensions are the natural candidates to reduce dimensionality.

There are many cases where correlation in high-dimensional systems may not be accurately
described by covariance structures. An important example is the correlation typically found in
high-frequency data, which is better described by the so-called quadratic variation matrix

$$[M]_t := \big([M^i, M^j]_t\big)_{1 \le i,j \le d}; \quad 0 \le t \le T,$$

where $M = (M^1, \ldots, M^d)$ is a $d$-dimensional semimartingale sampled over the time horizon $[0,T]$ and
$[M^i, M^j]$ is the quadratic covariation process between $M^i$ and $M^j$.

In a financial context, the process $[M]$ is called the volatility matrix (sometimes called integrated
volatility). The total amount of volatility in a $d$-dimensional semimartingale system over $[0,t]$ is fully
described by the following quantity

$$\|[M]_t\|^2_{(2)} = \sum_{i=1}^{d} (\lambda^i_t)^2,$$

where $\lambda^1_t, \ldots, \lambda^d_t$ are the random eigenvalues of $[M]_t$ and $\|\cdot\|_{(2)}$ is the usual Hilbert-Schmidt norm.
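The quantity above can be illustrated on simulated data. The following is a minimal sketch, assuming equally spaced observations and a hypothetical mixing matrix `L` (neither is taken from the paper): it forms the realized quadratic variation matrix as a sum of outer products of increments and compares the squared Hilbert-Schmidt norm with the sum of squared eigenvalues.

```python
import numpy as np


def realized_qv(paths):
    """Realized quadratic variation matrix of paths with shape (n + 1, d)."""
    increments = np.diff(paths, axis=0)
    return increments.T @ increments


rng = np.random.default_rng(0)
n, d, T = 20_000, 3, 1.0

# Hypothetical example: correlated Brownian motions M = B L^T.
L = np.array([[1.0, 0.0, 0.0],
              [0.8, 0.6, 0.0],
              [0.5, 0.5, 0.7]])
dB = rng.normal(scale=np.sqrt(T / n), size=(n, d))
M = np.vstack([np.zeros(d), np.cumsum(dB @ L.T, axis=0)])

qv = realized_qv(M)                          # approximates [M]_T = T * L L^T
eigenvalues = np.linalg.eigvalsh(qv)[::-1]   # decreasing order
hs_norm_sq = np.sum(qv ** 2)                 # squared Hilbert-Schmidt norm

print("eigenvalues:", eigenvalues)
print("HS norm squared:", hs_norm_sq, "sum of squared eigenvalues:", np.sum(eigenvalues ** 2))
```

For a symmetric matrix the two printed numbers agree up to rounding, which is the identity used in the display above.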

Volatility is by far the most important quantity which needs to be estimated for asset pricing, asset
allocation and risk management, especially in high-dimensional portfolios. The estimation of
high-dimensional quadratic variation matrices has been a topic of great interest in recent years. We refer
the reader to the works [19, 50, 51, 39, 52, 18, 23, 21] and other references therein.

Despite all the recent progress on volatility matrix estimation, there has been remarkably little
fundamental theoretical study on dimension reduction techniques based on high-dimensional quadratic
variation matrices. One notorious difficulty is the dynamic interpretation of directions and principal
components over the time horizon, which in typical cases is formulated in a high-frequency domain.
Indeed, $\{[M]_t; 0 \le t \le T\}$ is fully random, which makes the analysis more involved than the standard
PCA. More precisely, all the potentially optimal projections will be stochastic processes rather than
deterministic vectors.

Since many correlation structures in high-dimensional data are fully represented
by the quadratic variation concept, it is natural and necessary to construct a dimension reduction
methodology strictly associated with $[M]$ rather than with classical covariance or conditional distributions.
This is the program we start to carry out in this paper.

1.1. Contributions. Let $M = (M^1, \ldots, M^d)$ be a $d$-dimensional semimartingale. The starting point
of the analysis is to solve an identification problem related to a possible singularity of the random
matrix $[M]_t$, which can typically be found e.g. in large portfolios of financial assets (see e.g. Buraschi,
Porchia and Trojani, Aït-Sahalia and Xiu [3], Fan, Li and Yu [23] and other references therein)
and affine term structure models (see e.g. Björk and Landén [15], Filipović [29] and Filipović and
Sharef [27]). More precisely, in the presence of non-trivial correlation among semimartingales, one has
rank $[M]_t < d$ a.s. and then, under mild assumptions, one can split the set $\mathcal{M} = \text{span}\{M^1, \ldots, M^d\}$
into two complementary linear spaces $(\mathcal{W}, \mathcal{D})$ such that

$$\mathcal{M} = \mathcal{W} \oplus \mathcal{D},$$

where $\mathcal{W}$ and $\mathcal{D}$ contain only elements of $\mathcal{M}$ with non-zero and zero quadratic variation, respectively.
The space $\mathcal{W}$ fully describes the volatility structure of $\mathcal{M}$ while $\mathcal{D}$ is responsible for its hidden pure
drift (null quadratic variation) dynamics. At this point, we stress that the potential singularity of the
quadratic variation matrix $[M]_t$ introduces non-observable drift components into $\mathcal{M}$ which cannot be
discarded in a high-frequency situation. Both spaces are equally important to explain the dynamics
of $M$ under a given physical probability measure. In strong contrast, directions with null variance can be
fully discarded in the classical PCA. This is the first major difference between the classical PCA and
the theory developed in this article.

We follow the natural and simple idea of seeking random variables $v_t = (v^1_t, \ldots, v^d_t)$ such that

$$\sum_{i=1}^{d} v^i_t M^i$$

has the largest possible instantaneous quadratic variation over $[0,t]$, where $v_t$ is interpreted as a random
coefficient at time $t \in [0,T]$ rather than a process. By iterating this procedure in an orthogonal way,
we shall get a linear transformation of $M$ which, under some mild primitive conditions, will be a
finite-dimensional semimartingale ranked in terms of quadratic variation. Starting with consistent
estimators for the quadratic variation matrix $[M]_t$ (see e.g. [19, 50, 51, 39, 52, 18, 23, 21] and
other references therein), we are able to propose consistent estimators for $(\mathcal{W}, \mathcal{D})$ by means of a
simple eigenvalue analysis of $[M]_T$ based on high-frequency observations of $M$. This allows us to
reduce dimensionality in terms of quadratic variation in a very clear and consistent way. Equally
important, the methodology also estimates bounded variation components in $\mathcal{D}$, which cannot be
neglected in multi-dimensional semimartingale systems.
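To make the eigenvalue analysis concrete, here is a rough sketch on simulated data. It is not the estimator studied in the paper: the three-dimensional system, the relative cut-off `relative_tol` and all variable names are assumptions made for illustration only. The idea is that eigenvectors of the realized quadratic variation matrix with (numerically) vanishing eigenvalues point to null quadratic variation directions, while the remaining ones carry the volatility.

```python
import numpy as np

rng = np.random.default_rng(7)
T, n = 1.0, 20_000
t = np.linspace(0.0, T, n + 1)


def brownian(rng, n, T):
    """Brownian path on an equally spaced grid with n steps."""
    return np.concatenate([[0.0], np.cumsum(rng.normal(scale=np.sqrt(T / n), size=n))])


# Hypothetical truly 3-dimensional system whose quadratic variation matrix has rank 2:
# the combination M1 - M2 - M3 = -sin(t) is a pure drift.
B1, B2 = brownian(rng, n, T), brownian(rng, n, T)
M = np.column_stack([B1, B2, B1 - B2 + np.sin(t)])

dM = np.diff(M, axis=0)
qv = dM.T @ dM                                   # realized version of [M]_T
eigenvalues, eigenvectors = np.linalg.eigh(qv)   # ascending order

relative_tol = 1e-3                              # hypothetical cut-off, chosen by hand
is_null = eigenvalues < relative_tol * eigenvalues[-1]
dim_D_hat = int(np.sum(is_null))                 # estimated dimension of the drift part
dim_W_hat = M.shape[1] - dim_D_hat               # estimated dimension of the volatility part
basis_D_hat = eigenvectors[:, is_null]
basis_W_hat = eigenvectors[:, ~is_null]

print("eigenvalues:", eigenvalues)
print("dim W, dim D:", dim_W_hat, dim_D_hat)
```

In this toy setting the recovered null direction is proportional to (1, -1, -1), so projecting the data on it isolates the drift component instead of discarding it.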

The PCA for semimartingales introduced in the first part of the article is applied to the estimation
of principal components of discretely-observed space-time semimartingales which describe stochastic
partial differential equations (henceforth abbreviated as stochastic PDEs) admitting finite-dimensional
realizations. In particular, in the second part of this article, we illustrate the theory by studying the
problem of the estimation of the so-called finite-dimensional invariant manifolds with respect to a stochastic
PDE

(1.1) $$dr_t = \big(A(r_t) + F(r_t)\big)dt + \sum_{i=1}^{m} \sigma^i(r_t)\,dB^i_t; \quad r_0 = h \in E; \ 0 \le t \le T,$$

where $E$ is a potentially infinite-dimensional Sobolev-type space of continuous functions and
$(A, F, \sigma^i; 1 \le i \le m)$ satisfy standard assumptions for the existence of a solution.

Many space-time phenomena in natural and social sciences can be described by solutions of
stochastic PDEs like (1.1). However, the intrinsic infinite-dimensionality of space-time data generated
by models like (1.1) creates a big challenge in the statistical analysis of these models. In particular
cases, it is well known that one can reduce dimensionality and still get a very rich class of space-time
data generated by models of type (1.1). For instance, under Lie algebra conditions (see Filipović and
Teichmann [25] and Björk and Svensson) on the coefficients of (1.1), it is well known that there
exists a family of affine manifolds $\{\mathcal{G}_t; 0 \le t \le T\}$ of curves and a $d$-dimensional semimartingale factor
process $M$ such that

(1.2) $$r_t(\cdot) = g_t(\cdot, M_t); \quad r_0 = h; \ 0 \le t \le T,$$

where $\mathcal{G} = \{g_t(\cdot; x); x \in \mathcal{X} \subset \mathbb{R}^d; 0 \le t \le T\} \subset E$ is a finite-dimensional parameterized family of
smooth curves. We shall write it as $\mathcal{G}_t = \phi_t + \mathcal{V}$, where $\mathcal{V}$ is a $d$-dimensional vector space generated by
smooth curves and $\phi$ is an $E$-valued smooth parametrization which we assume to be a zero quadratic
variation function.

Two central unsolved problems in stochastic PDE modelling are: (i) the construction of
statistical tests to check the existence of $\mathcal{G}$ and (ii) the development of related estimation methods. The
importance of this research agenda can be mainly understood in applications to interest rate
modelling and other term-structure problems in Mathematical Finance. The literature is vast, so we refer
the reader to e.g [iiiiiiiaiMiiiaiiiissiiiiiiiiisiiMiiiiiiTiiiiiis] and other references therein. 
In short, under the assumption of existence of $\mathcal{G}$, the estimation of $\mathcal{V}$ is essential for a consistent
calibration of potentially infinite-dimensional term-structure models.

Under the assumption that the stochastic PDE (1.1) admits an affine finite-dimensional
representation (1.2), we apply the semimartingale PCA to estimate and identify components of invariant
manifolds $\mathcal{G}$ which depict volatility and drift dynamics in space. More precisely, let us consider the
finite rank random linear operator $Q_T : E \to \mathcal{V}$ defined by

$$(Q_T f)(u) := \langle Q_T(u, \cdot), f \rangle_E; \quad f \in E,$$

where $Q_T(u,v) := [r(u), r(v)]_T$; $u, v \in \mathbb{R}_+$, $\langle \cdot, \cdot \rangle_E$ is the inner product of $E$ and we set $\mathcal{Q} := \text{range } Q_T$.
We notice that the quadratic variation of the stochastic PDE (1.1) is fully generated by $Q_T$. In particular,
the associated Hilbert-Schmidt norm

$$\|Q_T\|^2_{(2)} = \sum_{i=1}^{\dim \mathcal{V}} \theta_i^2$$

fully describes the total amount of energy related to the quadratic variation of (1.1) over $[0,T]$. Here,
$\theta_1 \ge \theta_2 \ge \cdots$ are the eigenvalues of $Q_T$ arranged in decreasing order.

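In general, $\dim \mathcal{Q} \le \dim \mathcal{V}$ a.s., but in typical situations we do have $\dim \mathcal{Q} < \dim \mathcal{V}$. Let $\mathcal{N}$ be
the complementary subspace of $\mathcal{Q}$ in $\mathcal{V}$. Under mild assumptions, we have the following splitting

$$\mathcal{V} = \mathcal{Q} \oplus \mathcal{N} \quad \text{a.s.}$$

On the one hand, the pair of subspaces $(\mathcal{Q}, \mathcal{N})$ should be considered as the analogue of $(\mathcal{W}, \mathcal{D})$,
but in the spatial variable. On the other hand, we stress that $M$ is not observed and (1.2) is treated
as a factor model

(1.3) $$r_t = \phi_t + \sum_{j=1}^{\dim \mathcal{V}} M^j_t \lambda_j; \quad \mathcal{V} = \text{span}\{\lambda_1, \ldots, \lambda_d\}$$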

with dimension $d = \dim \mathcal{Q} + \dim \mathcal{N}$. The present methodology allows us to estimate and identify
directions of the invariant manifold which come from the volatility (represented by $\mathcal{Q}$) and the drift
(represented by $\mathcal{N}$). More importantly, we are able to identify them separately, which allows us to
estimate null and non-null quadratic variation factors by projecting space-time data of the form (1.3)
onto a pair of estimated vector spaces $(\mathcal{Q}, \mathcal{N})$. We consider this separation feature as the most
important aspect of the second part of this article. As a by-product, our methodology brings two

^The empirical literature on interest rate modelling reports strong evidence of correlation among risk factors (see 
e.g OHslIIZ]) which suggest that one can typically find dim Q < dim V in case of affine models. From theoretical side, 
this phenomena is also related to no-arbitrage restrictions imposed on affine models. See e.g |14112611271 [Tl [^ and other 
references therein. 
contributions to the field: It provides a consistent volatility dimension reduction and a method to
estimate hidden pure drift components in space-time semimartingale data generating processes. 

Our methodology is a combination of classical factor models with suitable random
transformations over the space of latent semimartingales. More precisely, our approach consists essentially
of two steps. Firstly, we apply an empirical covariance operator to the space-time data to obtain a
factor decomposition of the form

k 

rt{x) t G n,a; G n' 

j=i 

in the spirit of discrete-type factor models (see e.g. Stock and Watson [57], Bai [5] and Bai and Ng),
but in a high-frequency setup as opposed to the usual panel data. In other words, $\Pi \times \Pi'$ is a refining
partition of a two-dimensional set $[0,T] \times [a,b]$. In linear structures, the covariance operator only
neglects components of null empirical variance so that, under suitable conditions, our first step does
not lose information from the invariant manifold $\mathcal{V} = \mathcal{Q} \oplus \mathcal{N}$. The second step consists in using the
semimartingale PCA jointly with suitable random rotations of latent factor estimators $Y_t$ to infer the
underlying semimartingale structure of the data. It is not obvious that this two-step procedure
should work. Indeed, to the best of our knowledge, it is not known whether the covariance operator
decomposition is strong enough to provide a resulting process which is amenable to a consistent
quadratic variation analysis. In fact, the sequence $Y_t$ is not even associated with a semimartingale, so
that the quadratic variation analysis based on this two-step procedure must be considered in a broader
sense. The proof that this strategy works is the content of the second part of the paper.
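The two-step idea can be mimicked on synthetic space-time data. The sketch below is a strongly simplified illustration and not the procedure analyzed in the paper: the latent factors, the loading curves, the rank cut-off and the use of a plain SVD for the first step are all assumptions introduced here. Step 1 extracts factor scores by a classical PCA in the spatial variable; step 2 looks at the realized quadratic variation of those scores, which reveals that only one of the two retained factors carries volatility.

```python
import numpy as np

rng = np.random.default_rng(5)
T, n, m = 1.0, 5_000, 40
t = np.linspace(0.0, T, n + 1)     # time grid
x = np.linspace(0.0, 1.0, m)       # spatial grid

# Hypothetical latent factors: one Brownian factor and one pure-drift factor.
B = np.concatenate([[0.0], np.cumsum(rng.normal(scale=np.sqrt(T / n), size=n))])
factors = np.column_stack([B, t])
# Hypothetical loading curves (illustrative only).
loadings = np.vstack([np.exp(-x), x * np.exp(-x)])
r = factors @ loadings             # space-time data, shape (n + 1, m)

# Step 1: classical PCA of the data in the spatial variable.
centered = r - r.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
k = int(np.sum(singular_values > 1e-8 * singular_values[0]))
scores = centered @ vt[:k].T       # estimated factors, up to an invertible linear map

# Step 2: realized quadratic variation of the estimated factors.
d_scores = np.diff(scores, axis=0)
qv_scores = d_scores.T @ d_scores
qv_eigenvalues = np.linalg.eigvalsh(qv_scores)[::-1]

print("retained factors:", k)
print("eigenvalues of the realized QV of the scores:", qv_eigenvalues)
```

Both factors survive the first step because both have non-zero empirical variance, whereas the second eigenvalue of the realized quadratic variation is of order 1/n, flagging a pure drift direction.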

It is important to stress that both steps in our methodology are equally important. For instance,
the naive application of classical factor models to infer quadratic variation is meaningless when applied
to semimartingale systems. Moreover, a more straightforward strategy based directly on an empirical
quadratic variation does not work in full generality due to a possible singularity of the matrix. This
last procedure forces the assumption that $\dim \mathcal{V} = \dim \mathcal{Q}$, which may not be optimal (in the
mean-square sense) in typical situations when $\dim \mathcal{N} > 0$. This is the reason why the two-step procedure
is implemented in this work. For instance, Pelger [42] studies principal components directly from
the empirical quadratic covariation for factors with jumps and discrete loading factors. One crucial
assumption in his setup is the non-singularity of the quadratic variation matrix, which restricts the
applicability in multivariate systems with non-trivial correlation typically found in large portfolios
and interest rate models.

We should mention that another possible framework is introduced by Aït-Sahalia and Xiu [3],
who interpret principal component analysis by means of the underlying volatility process. The main
drawback of this strategy is the fact that the rank of the volatility matrix may be strictly smaller than
the rank of $[M]_t$ (as shown in Proposition 4.1), thus resulting in a substantial underestimation of $\dim \mathcal{W}$
and the associated factors. Therefore, similar to Pelger [42], the strategy introduced by [3] does not
recover in full generality the whole semimartingale structure ($\mathcal{M}$) involved in the optimal decomposition
due to a possible non-negligible dimension ($\dim \mathcal{D}$) associated with the drift. In addition, the strong
assumption of simple eigenvalues imposed in [3] rules out many finite-dimensional semimartingale
systems typically found in applications.

1.2. Organization of the paper. The remainder of this article is structured as follows. Section 2
presents some notation and preliminary results. Section 3 presents the spectral analysis of a generic
quadratic variation matrix. Section 4 illustrates the existence of bounded variation components in
portfolio management and interest rate models. Section 5 presents the consistency results for the
estimators of the dynamic spaces. Section 6 presents the application of semimartingale PCA to the
problem of estimating finite-dimensional invariant manifolds for stochastic PDEs. Section 7 presents
the numerical results and applications to real data. An Appendix is given in Section 8, which presents
an estimator for $\dim \mathcal{Q}$.
2. Assumptions and Preliminary Results

At first, let us fix notation. 

2.1. Notation. Throughout this article, we are going to work with a fixed stochastic basis of the form
$(\Omega, \mathcal{F}_T, \mathbb{F}, \mathbb{P})$, where $(\Omega, \mathcal{F}_T, \mathbb{P})$ is a probability space equipped with a sample space $\Omega$, a sigma-algebra
$\mathcal{F}_T$, a probability measure $\mathbb{P}$ and a fixed terminal time $0 < T < \infty$. We equip the interval $[0,T]$ with
the Borel sigma-algebra $\mathcal{B}_T$ and we assume the filtration $\mathbb{F} := \{\mathcal{F}_t; 0 \le t \le T\}$ satisfies the usual
conditions.

All the algebraic setup in this article will be based on the real linear space $\mathcal{X}^d$ constituted by the
set of all $\mathbb{R}^d$-valued $\mathcal{B}_T \times \mathcal{F}_T$-measurable processes. In this article, the most important subclass of
$\mathcal{X}^d$ will be the subspace $\mathcal{S}^d$ constituted by the set of all $\mathbb{R}^d$-valued continuous $\mathbb{F}$-semimartingales on
$(\Omega, \mathcal{F}_T, \mathbb{F}, \mathbb{P})$. When $d = 1$, we set $\mathcal{S} := \mathcal{S}^1$, $\mathcal{X} := \mathcal{X}^1$. We denote by $L^{0,k}_t$ the set of all $\mathbb{R}^k$-valued
and $\mathcal{F}_t$-measurable random variables for $k \ge 1$ and $t \in [0,T]$. Throughout this article, we adopt the
following convention: If $Y \in \mathcal{X}^d$, then $Y_t$ is interpreted as a column random vector in $\mathbb{R}^d$. Convergence
in probability will be denoted by $\xrightarrow{p}$.

In the remainder of this article, $\Pi$ denotes a deterministic partition $0 = t_0 < t_1 < \cdots < t_n = T$
and $\|\Pi\| := \max_{1 \le i \le n} |t_i - t_{i-1}|$. The set $\mathbb{M}_{p \times q}$ denotes the space of all $p \times q$ real matrices and $\mathbb{M}^+_{p \times p}$
is the subset of $p \times p$ non-negative symmetric real matrices. The norm of linear operators between
Hilbert spaces will be the standard Hilbert-Schmidt norm $\|\cdot\|_{(2)}$ and $P^\top$ denotes the transpose of a
matrix $P \in \mathbb{M}_{p \times q}$; $p, q \ge 1$. If $A, B$ are two linear subspaces of $\mathcal{X}$ with $A \subset B$, then we denote by $\pi_A$
the usual projection of $B$ onto the quotient space $B/A$. Throughout this article, we omit the variable
$\omega \in \Omega$ when no confusion arises.

2.2. Analysis of quadratic variation matrices. In this work, the following bracket will play a key
role in our analysis

(2.1) $$[X, Y]_t := \lim_{\|\Pi\| \to 0} \sum_{t_i \le t} \big(X_{t_i} - X_{t_{i-1}}\big)\big(Y_{t_i} - Y_{t_{i-1}}\big); \quad 0 \le t \le T,$$

in probability.

Definition 2.1. The quadratic covariation $\{[X, Y]_t; 0 \le t \le T\}$ exists for a given pair $(X, Y) \in \mathcal{X}^2$
if the limit exists for every sequence of partitions $\Pi$ such that $\|\Pi\| \to 0$. We say that $X \in \mathcal{X}$ has
null quadratic variation if $[X, X] = 0$ a.s.
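As a numerical illustration of the bracket in (2.1), the following sketch evaluates the sums of products of increments along nested dyadic partitions. The Brownian path, the smooth path `A` and the grid sizes are assumptions chosen for the example; the point is only that the sums stabilize near $T$ for a Brownian motion and vanish for a smooth bounded variation path.

```python
import numpy as np


def realized_covariation(x, y):
    """Sum of products of increments of two paths sampled on the same partition."""
    return float(np.sum(np.diff(x) * np.diff(y)))


rng = np.random.default_rng(1)
T, n_fine = 1.0, 2 ** 16
t = np.linspace(0.0, T, n_fine + 1)
B = np.concatenate([[0.0], np.cumsum(rng.normal(scale=np.sqrt(T / n_fine), size=n_fine))])
A = t ** 2                         # smooth path with bounded variation

results = {}
for k in (8, 12, 16):              # partitions with 2**k intervals, mesh going to zero
    step = n_fine // 2 ** k
    results[k] = (
        realized_covariation(B[::step], B[::step]),   # tends to [B, B]_T = T
        realized_covariation(A[::step], A[::step]),   # tends to 0
        realized_covariation(B[::step], A[::step]),   # tends to 0
    )

for k, values in results.items():
    print(k, values)
```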

Of course, $\{[X^1, X^2]_t; 0 \le t \le T\}$ is a well-defined bounded variation adapted process for every
$(X^1, X^2) \in \mathcal{S}^2$. To shorten notation, we sometimes set $[Y] := [Y, Y]$ for $Y \in \mathcal{X}$. For a given $X =
(X^1, \ldots, X^d) \in \mathcal{X}^d$ and $(t, \omega) \in [0,T] \times \Omega$, with a slight abuse of notation, we write $[X]_t(\omega) \in \mathbb{M}^+_{d \times d}$
to denote the following random matrix

(2.2) $$[X]_t(\omega) := \big([X^i, X^j]_t(\omega)\big)_{i,j = 1, \ldots, d}; \quad 0 \le t \le T, \ \omega \in \Omega,$$

whenever the right-hand side of (2.2) exists.

In the remainder of this section, $M = (M^1, \ldots, M^d)$ is a given $d$-dimensional measurable process.
We say that $M \in \mathcal{X}^d$ is truly $d$-dimensional if its components $M^1, \ldots, M^d$ are linearly independent
over the vector space $\mathcal{X}$. Throughout this paper, we are going to assume the following standing
assumptions:

Assumption 2.1. $M$ is a truly $d$-dimensional measurable process.

Assumption 2.2. The quadratic variation matrix $\{[M]_t; 0 \le t \le T\}$ exists and if there exists $i \in
\{1, \ldots, d\}$ such that $\mathbb{P}\{[M^i, M^i]_t > 0\} > 0$, then we have $\mathbb{P}\{[M^i, M^i]_t > 0\} = 1$.
Remark 2.1. We clearly do not lose generality by imposing Assumption 2.1. Assumption 2.2 is very
natural since our theory relies on the study of a realization of the quadratic variation matrix, and thus
it is necessary that we do not get a realization of null quadratic variation from a non-null quadratic
variation process.


Example: One typical example of a semimartingale which satisfies Assumption 2.2 is given by the
$2d$-dimensional Heston model $(M^i, V^i); i = 1, \ldots, d$ with correlation in $[-1,1]$, where $V^i$ denotes the $i$th
square-root-type stochastic volatility component for $i = 1, \ldots, d$. Then, one can easily check that for every
$t \in (0,T]$, we have

$$[M^i, M^i]_t = \int_0^t |M^i_s|^2 V^i_s \, ds > 0 \quad \text{a.s. for every } i = 1, \ldots, d.$$

Hence, the classical Heston model satisfies Assumption 2.2.

Let $\mathcal{M}_t := \text{span}\{M^1, \ldots, M^d\}$ be the linear space spanned by the 1-dimensional measurable
processes over $[0,t]$ for $0 \le t \le T$. Assumption 2.1 yields $\dim \mathcal{M}_t = d$ for every $t \in (0,T]$.

Let us now split $\mathcal{M}_t$ into two orthogonal subspaces. At first, we set

(2.3) $$\mathcal{D}_t := \{X \in \mathcal{M}_t; [X]_t = 0 \ \text{a.s.}\}.$$

Observe that $\mathcal{D}_t$ is a well-defined linear subspace of $\mathcal{M}_t$ for every $t \in [0,T]$. More importantly, the
following remark holds.

Remark 2.2. We recall that any continuous bounded variation local martingale must be constant a.s.
Moreover, for every $t \in (0,T]$,

$$\{\omega; [Y, Y]_t(\omega) = 0\} = \{\omega \in \Omega; N_\cdot(\omega) = 0 \text{ over the interval } [0,t]\},$$

where $N$ is the local martingale component of the special semimartingale decomposition of some $Y \in \mathcal{S}$.
Therefore, Assumption 2.2 allows us to state that if $M \in \mathcal{S}^d$ is a truly $d$-dimensional process, then $\mathcal{D}_t$
is a subspace of $\mathcal{M}_t$ only constituted by continuous bounded variation adapted processes over $[0,t]$.

Definition 2.2. Let $\mathcal{M}_t$ be the span generated by a truly $d$-dimensional measurable process $M \in \mathcal{X}^d$
over $[0,t]$. If $\dim \mathcal{D}_t > 0$, then we say that $\mathcal{M}_t$ has a null quadratic variation component over
the interval $[0,t]$. In particular, if $M \in \mathcal{S}^d$ and $\dim \mathcal{D}_t > 0$, then we say that $\mathcal{M}_t$ has a bounded
variation component over $[0,t]$.

Let us give a toy example showing how a non-trivial dimension induced by bounded variation 
processes may appear in a very simple context. 

Example: Let $B$ be a one-dimensional Brownian motion and let $M_t = (B_t, B_t + t)$; $0 \le t \le T$.
Of course, $M$ is a truly 2-dimensional semimartingale where $\dim \mathcal{M}_t = 2$ for every $t \in (0,T]$. In
particular, we clearly have $\dim \mathcal{D}_t = 1$ for every $t \in (0,T]$.
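The toy example can be checked numerically. The sketch below, assuming an equally spaced grid with `n` steps (a choice made here, not in the text), builds the realized quadratic variation matrix of $(B_t, B_t + t)$, observes that it is nearly singular, and projects the data on the eigenvector with the smallest eigenvalue to recover the hidden drift up to sign and scale.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 50_000
t = np.linspace(0.0, T, n + 1)
B = np.concatenate([[0.0], np.cumsum(rng.normal(scale=np.sqrt(T / n), size=n))])
M = np.column_stack([B, B + t])                  # the pair (B_t, B_t + t)

dM = np.diff(M, axis=0)
qv = dM.T @ dM                                   # close to T * [[1, 1], [1, 1]]
eigenvalues, eigenvectors = np.linalg.eigh(qv)   # ascending order

null_direction = eigenvectors[:, 0]              # close to +/- (1, -1) / sqrt(2)
hidden = M @ null_direction                      # close to -/+ t / sqrt(2)

print("eigenvalues:", eigenvalues)
print("null direction:", null_direction)
```

The small eigenvalue is of order $T^2/n$, so the direction is invisible to the quadratic variation, yet the projected path is the non-trivial drift $t/\sqrt{2}$ up to sign.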

For a deeper discussion of bounded variation components in semimartingale systems, we refer the
reader to Section 4. Let us now provide a natural notion of “quadratic variation dimension” in $\mathcal{M}_t$.
To do so, let us consider the following quotient space

(2.4) $$\widetilde{\mathcal{M}}_t := \mathcal{M}_t / \mathcal{D}_t; \quad 0 \le t \le T.$$

By definition, $\widetilde{\mathcal{M}}_t$ can be identified with $(\mathcal{M}_t, \sim)$, where the equivalence relation is given by

(2.5) $$X \sim Y \iff X - Y \text{ is a null quadratic variation process in } \mathcal{M}_t \text{ over the interval } [0,t].$$
The following simple result connects the rank of $[M]_t$ with the dimension of $\widetilde{\mathcal{M}}_t$.

Lemma 2.1. Let $M \in \mathcal{X}^d$ be a truly $d$-dimensional measurable process satisfying Assumption 2.2.
Then,

$$\text{rank}\,[M]_t = \dim \widetilde{\mathcal{M}}_t \quad \text{a.s.}$$

for $t \in [0,T]$.


Proof. The result for $t = 0$ is obvious, so we fix $0 < t \le T$. Let $p_t = \dim \widetilde{\mathcal{M}}_t$ and let $\pi_{\mathcal{D}_t} : \mathcal{M}_t \to \widetilde{\mathcal{M}}_t$
be the standard projection of $\mathcal{M}_t$ onto $\widetilde{\mathcal{M}}_t$. Now, we observe that for each $P \in \mathcal{M}_t$, $\pi_{\mathcal{D}_t}(P)$ is a
set of continuous measurable processes, each of which differs from each other by a continuous null
quadratic variation measurable process over the interval $[0,t]$. Nevertheless, for each process in $\pi_{\mathcal{D}_t}(P)$
its quadratic variation is equal to $[P]_t$. Therefore, we may define its quadratic variation as $[P]_t$. By
the polarization identity, we may define

(2.6) $$[\pi_{\mathcal{D}_t}(P), \pi_{\mathcal{D}_t}(Q)]_t := [P, Q]_t$$

for any $P, Q \in \mathcal{M}_t$. In particular, this shows that $[N, Z]_t$ is a well-defined random variable for any
$N, Z \in \widetilde{\mathcal{M}}_t$.

Since span $\{\pi_{\mathcal{D}_t}(M^1), \ldots, \pi_{\mathcal{D}_t}(M^d)\} = \widetilde{\mathcal{M}}_t$, the set $\{\pi_{\mathcal{D}_t}(M^1), \ldots, \pi_{\mathcal{D}_t}(M^d)\}$ contains $p_t$ linearly
independent components in the vector space $\widetilde{\mathcal{M}}_t$. Therefore, $\dim \widetilde{\mathcal{M}}_t$ equals the number of linearly
independent components in the subset $\{\pi_{\mathcal{D}_t}(M^1), \ldots, \pi_{\mathcal{D}_t}(M^d)\}$.

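Let us now consider a subset $\{\pi_{\mathcal{D}_t}(M^{\sigma(1)}), \ldots, \pi_{\mathcal{D}_t}(M^{\sigma(k)})\}$ of $k$ equivalence classes, where
$\sigma : \{1, \ldots, k\} \to \{1, \ldots, d\}$ is a function. Let $c_1, \ldots, c_k \in \mathbb{R}$. In the sequel, we denote by $\mathbf{0}$ the null
element of $\widetilde{\mathcal{M}}_t$. With this notation at hand, the Cauchy-Schwarz inequality yields

$$\sum_{i=1}^{k} c_i \pi_{\mathcal{D}_t}(M^{\sigma(i)}) = \mathbf{0} \iff \forall N \in \widetilde{\mathcal{M}}_t, \ \Big[\sum_{i=1}^{k} c_i \pi_{\mathcal{D}_t}(M^{\sigma(i)}), N\Big]_t = 0 \ \text{a.s.}$$

In particular,

$$\sum_{i=1}^{k} c_i \pi_{\mathcal{D}_t}(M^{\sigma(i)}) = \mathbf{0} \iff \Big[\sum_{i=1}^{k} c_i \pi_{\mathcal{D}_t}(M^{\sigma(i)}), \pi_{\mathcal{D}_t}(M^{\sigma(j)})\Big]_t = 0 \ \text{a.s.}, \quad j = 1, \ldots, k.$$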


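By recalling that $\{\pi_{\mathcal{D}_t}(M^{\sigma(1)}), \ldots, \pi_{\mathcal{D}_t}(M^{\sigma(k)})\}$ is linearly independent if, and only if,

$$\sum_{i=1}^{k} c_i \pi_{\mathcal{D}_t}(M^{\sigma(i)}) = \mathbf{0} \iff c_1 = \cdots = c_k = 0,$$

then the statement that $\{\pi_{\mathcal{D}_t}(M^{\sigma(1)}), \ldots, \pi_{\mathcal{D}_t}(M^{\sigma(k)})\}$ is a linearly independent set is equivalent to the
statement that the system of equations

$$\sum_{i=1}^{k} c_i \big[\pi_{\mathcal{D}_t}(M^{\sigma(i)}), \pi_{\mathcal{D}_t}(M^{\sigma(j)})\big]_t = 0 \ \text{a.s.}, \quad j = 1, \ldots, k,$$

has only the trivial solution $c_1 = \cdots = c_k = 0$ almost surely. In other words,

(2.7) $$\det\Big(\big[\pi_{\mathcal{D}_t}(M^{\sigma(i)}), \pi_{\mathcal{D}_t}(M^{\sigma(j)})\big]_t; \ i,j = 1, \ldots, k\Big) \ne 0 \quad \text{a.s.}$$

From (2.6), (2.7) and Assumption 2.2, we conclude the proof.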


□ 
Summing up the results of this section, we arrive at the following direct sum

(2.8) $$\mathcal{M}_t = \mathcal{W}_t \oplus \mathcal{D}_t; \quad 0 \le t \le T,$$

where $\{\mathcal{W}_t; 0 \le t \le T\}$ is the unique (up to isomorphisms) family of complementary linear subspaces
of $\mathcal{M}_t$ which realizes (2.8). One should notice that $\mathcal{W}_t$ is formed by the null process on $[0,t]$ and
by elements $Y$ in $\mathcal{M}_t$ such that $[Y, Y]_t > 0$ a.s. Of course, $\mathcal{W}_t$ is isomorphic to $\widetilde{\mathcal{M}}_t$ for every $t \in [0,T]$.
To shorten notation, in the remainder of this article, we write $\mathcal{M} := \mathcal{M}_T$, $\mathcal{W} := \mathcal{W}_T$, $\widetilde{\mathcal{M}} := \widetilde{\mathcal{M}}_T$ and
$\mathcal{D} := \mathcal{D}_T$.


3. Random directions and principal components 

Let us start with some heuristics related to dimension reduction for a high-dimensional vector of
semimartingales $M = (M^1, \ldots, M^d) \in \mathcal{S}^d$ in which we suspect there may be some redundancy in the
sense of quadratic variation. Perhaps there is some way to combine $M^1, \ldots, M^d$ that captures
much of the quadratic variation in a few aggregate semimartingales. In particular, we shall seek
random variables $v_t = (v^1_t, \ldots, v^d_t) \in L^{0,d}_t$ such that

(3.1) $$S_t := \sum_{i=1}^{d} v^i_t M^i_t$$

has the largest possible instantaneous quadratic variation over $[0,t]$, where $v_t = (v^1_t, \ldots, v^d_t)$ in (3.1)
is interpreted as a random coefficient at time $t \in [0,T]$ rather than a process. In other words, we seek
a random linear combination of the form (3.1) such that

$$\Big[\sum_{i=1}^{d} v^i_t M^i\Big]_t$$

has almost surely the largest possible value over the subset of $L^{0,d}_t$ with Euclidean norm 1 for a given
$t \in [0,T]$. Indeed, we compute the quadratic variation of the linear combination $S$ at time $t$ by
considering $v_t$ as a random constant over $[0,t]$, which yields

$$\sum_{i,j=1}^{d} v^i_t v^j_t [M^i, M^j]_t.$$

The random coefficient

$$\hat{v}_t := \operatorname{argmax}_{\|v_t\|_{\mathbb{R}^d} = 1} \sum_{i=1}^{d} \sum_{j=1}^{d} v^i_t v^j_t [M^i, M^j]_t$$

encodes the way to combine $M^1, \ldots, M^d$ to maximize instantaneous quadratic variation at time $t \in
[0,T]$. The new variable, the leading principal component, is $\hat{v}_t^\top M_t$. We shall continue this
strategy by seeking a possible lower-dimensional pairwise orthogonal sequence of aggregate variables
which might explain most of the quadratic variation at each time $t \in [0,T]$.
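The maximization described above is a Rayleigh quotient problem once a realization of the quadratic variation matrix is fixed. The following sketch, with a hypothetical four-dimensional system and a realized matrix standing in for $[M]_t$, takes the top eigenvector as the leading direction and compares it against randomly drawn unit vectors.

```python
import numpy as np

rng = np.random.default_rng(3)
T, n, d = 1.0, 10_000, 4

# Hypothetical correlated increments; qv plays the role of a realization of [M]_t.
mixing = rng.normal(size=(d, d))
dM = rng.normal(scale=np.sqrt(T / n), size=(n, d)) @ mixing.T
qv = dM.T @ dM

eigenvalues, eigenvectors = np.linalg.eigh(qv)   # ascending order
v_hat = eigenvectors[:, -1]                      # leading direction
lead_qv = v_hat @ qv @ v_hat                     # quadratic form at the leading direction

# Quadratic variation of the combination with the coefficient frozen over [0, t].
frozen_qv = np.sum((dM @ v_hat) ** 2)

# Random unit vectors never beat the leading eigenvector.
candidates = rng.normal(size=(5_000, d))
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)
candidate_qv = np.einsum("ki,ij,kj->k", candidates, qv, candidates)

print("largest eigenvalue:", eigenvalues[-1])
print("best random candidate:", candidate_qv.max())
```

Freezing the coefficient over $[0,t]$ is what turns the quadratic variation of the combination into the quadratic form $v^\top [M]_t v$, which is why `frozen_qv` and `lead_qv` coincide.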

For simplicity of exposition, we assume that one observes all trajectories of a given truly
$d$-dimensional continuous-time semimartingale $M$ satisfying Assumptions 2.1 and 2.2. Let us now
interpret the eigenvalues and eigenvectors of the quadratic variation matrix in the same way as
those of covariance matrices in classical PCA. In the sequel, we introduce the brackets
which encode the quadratic variation of random linear combinations as described at the beginning of this
section

(3.2) $$\langle X^\top Y \rangle_t := \sum_{i,j=1}^{d} X^i_t X^j_t [Y^i, Y^j]_t; \quad X_t \in L^{0,d}_t, \ Y \in \mathcal{X}^d.$$

The bracket $\langle X^\top Y, P^\top B \rangle_t$ is naturally defined by polarization. The bracket given in (3.2) encodes
the quadratic variation of $X^\top Y$ at time $t \in [0,T]$, where $X_t$ is considered as a random constant over
$[0,t]$ in the computation of (3.2). This is perfectly consistent with what happens in practice because, at a
given time $t \in [0,T]$, one observes high-frequency data from a semimartingale $M$ over $[0,t]$ and one
has to decide if there exist linear combinations of the elements of $\mathcal{M}_t$ which summarize the quadratic
variation $[M]_t$.

Lemma 3.1. Let $M \in \mathcal{S}^d$ be a truly $d$-dimensional semimartingale satisfying Assumption 2.2. Let us
consider the vector of eigenvalues $(\lambda^1_t(\omega), \ldots, \lambda^d_t(\omega))$ (ordered in such a way that $\lambda^1_t(\omega) \ge \lambda^2_t(\omega) \ge \cdots \ge
\lambda^d_t(\omega)$) of the matrix $[M]_t(\omega)$ for each $(\omega, t) \in \Omega \times [0,T]$. Then, for each $i$, $\{\lambda^i_t; 0 \le t \le T\}$ is an
adapted bounded variation process.

Proof. By the very definition, any eigenvalue $\lambda_t(\omega)$ is a root of the characteristic polynomial
$p(\lambda) = \det([M]_t(\omega) - \lambda I)$ of the random matrix $[M]_t(\omega)$. The degree of this polynomial is $d$ and its coefficients depend
on the entries of $[M]_t(\omega)$, except that its term of degree $d$ is always $(-1)^d \lambda^d$. This allows us to
conclude that the ordered eigenvalues are $\mathbb{F}$-adapted. In particular, by the classical Weyl's perturbation
theorem, we know there exists a deterministic constant $C$ such that

$$\max_j |\lambda^j_t(\omega) - \lambda^j_s(\omega)| \le C \|[M]_t(\omega) - [M]_s(\omega)\|_\infty; \quad (\omega, t) \in \Omega \times [0,T],$$

where $\|\cdot\|_\infty$ denotes the $\infty$-norm (maximum absolute row sum) of a symmetric matrix. By writing $\|[M]_t - [M]_s\|_\infty =
\max_{1 \le j \le d} \sum_{i=1}^{d} |[M^i, M^j]_t(\omega) - [M^i, M^j]_s(\omega)|$, we clearly see that $t \mapsto \lambda^j_t(\omega)$ has bounded variation for
almost all $\omega \in \Omega$. □
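The perturbation bound in the proof can be observed numerically. The sketch below is only an illustration under assumptions made here (a simulated three-dimensional system, cumulative realized matrices on a coarse grid, and the constant $C = 1$, which is admissible because the spectral norm of a symmetric matrix is bounded by its maximum absolute row sum). It also checks that the ordered eigenvalue paths are nondecreasing, which holds in this realized setting since each increment of the cumulative matrix is non-negative definite.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n, d = 1.0, 5_000, 3
mixing = np.array([[1.0, 0.0, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.2, -0.3, 0.4]])
dM = rng.normal(scale=np.sqrt(T / n), size=(n, d)) @ mixing.T

# Cumulative realized quadratic variation matrices on a coarse time grid.
outer = np.einsum("ni,nj->nij", dM, dM)
cumulative = np.cumsum(outer, axis=0)[99::100]          # shape (50, d, d)
eigen_paths = np.linalg.eigvalsh(cumulative)[:, ::-1]   # decreasing order at each time

# Left-hand side: largest eigenvalue displacement between consecutive grid times.
eig_jumps = np.max(np.abs(np.diff(eigen_paths, axis=0)), axis=1)
# Right-hand side: maximum absolute row sum of the matrix increment.
matrix_jumps = np.array([
    np.linalg.norm(later - earlier, ord=np.inf)
    for later, earlier in zip(cumulative[1:], cumulative[:-1])
])

print("bound holds at every step:", bool(np.all(eig_jumps <= matrix_jumps + 1e-12)))
print("total variation of the leading eigenvalue path:", np.sum(np.abs(np.diff(eigen_paths[:, 0]))))
```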

We are now able to summarize our discussion with the following result. 

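Proposition 3.1. Let $M$ be a semimartingale satisfying Assumptions 2.1 and 2.2. For a given
$t \in [0,T]$, let $(\lambda^1_t, \ldots, \lambda^d_t)$ be the list of eigenvalues of $[M]_t$ (arranged in decreasing order) and let
$(v^1_t, \ldots, v^d_t)$ be an associated set of eigenvectors. Then, for every $t \in [0,T]$,

$$\langle (v^1_t)^\top M \rangle_t = \max_{X_t \in L^{0,d}_t; \ \|X_t\|_{\mathbb{R}^d} = 1} \langle X_t^\top M \rangle_t = \lambda^1_t \quad \text{a.s.}$$

$$\langle (v^k_t)^\top M \rangle_t = \max_{X_t \in V^k_t; \ \|X_t\|_{\mathbb{R}^d} = 1} \langle X_t^\top M \rangle_t = \lambda^k_t \quad \text{a.s.}; \ k = 2, \ldots, d,$$

where $V^k_t$ := orthogonal complement of span $\{v^1_t, \ldots, v^{k-1}_t\}$ in $\mathbb{R}^d$ for $k = 2, \ldots, d$. In addition, if
$t \mapsto [M]_t$ is a generic smooth curve a.s. and $\dim \widetilde{\mathcal{M}}_t = p$ is a.s. constant over the time interval $(0,T]$,
then there exists a choice of adapted eigenvector processes $(v^1, \ldots, v^d)$ over $[0,T]$ such that

$$S^i_t := (v^i_t)^\top M_t; \quad 0 \le t \le T,$$

is a semimartingale for each $i \in \{1, \ldots, p\}$.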


^For a continuous real-valued function $f$ defined in a neighborhood of $t_0$, the order of flatness $m_{t_0}(f)$ at $t_0$ is defined
by the supremum of all integers $p$ such that $f(t) = (t - t_0)^p g(t)$ near $t_0$ for a continuous function $g$. We say that two
functions $f$ and $h$ meet of order $\ge p$ at $t_0$ when $m_{t_0}(f - h) \ge p$. Let $A(t); 0 \le t \le T$ be a parameterized family of
self-adjoint matrices. We say that the curve $t \mapsto A(t)$ is generic if no two of the continuously parameterized eigenvalues meet
of infinite order at any $t \in [0,T]$ unless they are equal for all $t$. We refer the reader to e.g. Rutter [45] and Alekseevsky,
Kriegl, Losik and Michor for further details.
Proof. Fix a realization $\omega \in \Omega$ and $t \in [0,T]$. Let $A = (a_{ij})$ be a $d \times d$ matrix with entries given by

$$a_{ij} = [M^i, M^j]_t(\omega); \quad i,j = 1, \ldots, d.$$

It follows from Assumptions 2.1 and 2.2 that $A$ is a non-negative definite matrix. Now, let us take
$v_t \in L^{0,d}_t$ and let $z = (z_1, \ldots, z_d) \in \mathbb{R}^d$ be given by $z_i = v^i_t(\omega)$. Then,

$$\langle z, Az \rangle_{\mathbb{R}^d} = \langle v_t^\top M \rangle_t.$$

Now, the variational characterization of eigenvalues follows from standard arguments on quadratic
forms over $\mathbb{R}^d$ for each $(\omega, t) \in \Omega \times [0,T]$. For the second part, if $t \mapsto [M]_t(\omega)$ is $C^\infty$, then from Theorem
7.6 in Alekseevsky et al. [5], one can choose smooth versions of the related eigenvectors $v^1(\omega), \ldots, v^d(\omega)$
with bounded variation paths. By Gaussian elimination and Lemma 3.1, one can readily see that one
can choose them in such a way that $(v^1, \ldots, v^d)$ is a $d$-dimensional adapted process. The usual integration
by parts for stochastic integrals allows us to state that $S = (S^1, \ldots, S^p)$ is a semimartingale. □

Similar to the classical PCA methodology based on covariance matrices, Proposition 3.1 yields a
dimension reduction based on quadratic variation rather than covariance as follows. Let $M$ be a truly
$d$-dimensional semimartingale satisfying Assumption 2.2 and let us assume that one observes $[M]_t(\omega)$
for a given $(\omega, t) \in \Omega \times (0,T]$. Summing up the above results, we shall reduce dimensionality as follows

(3.3) $$S^i_t = \sum_{\ell=1}^{d} v^{i\ell}_t M^\ell_t; \quad i = 1, \ldots, \dim \widetilde{\mathcal{M}}_t, \ 0 \le t \le T,$$

where $v^{i\ell}_t$ denotes the $\ell$th coordinate of $v^i_t$.
At this point, it is pertinent to make some remarks about (3.3). At first, the assumption in Proposition
3.1 that $\dim \widetilde{\mathcal{M}}_t = p$ is constant a.s. over $(0,T]$ holds in typical cases found in practice.

Remark 3.1. In order to get semimartingale principal components, the assumption that $t \mapsto [M]_t$ is
generic cannot be avoided. See e.g. Example 7.7 in Alekseevsky et al. However, one should notice
that if two eigenvalues meet at an infinite order at a time $t_0$, then all derivatives at this point must
coincide.

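By the very definition, $\lambda^1_t \ge \dots \ge \lambda^d_t \ge 0$ a.s for every $t \in [0,T]$, which means that $S^i$
presents the $i$th largest quadratic variation among $\{S^1,\dots,S^d\}$. One should notice that the principal
components are orthogonal in the sense

$\langle v^i_t, [M]_t v^j_t\rangle_{\mathbb{R}^d} = [(v^i_t)^\top M, (v^j_t)^\top M]_t = 0$ a.s; $0 \le t \le T$, $i \ne j$,

where $S^i = (v^i)^\top M$, $S^j = (v^j)^\top M$ for $i \ne j$. Moreover, the $i$th eigenvector $v^i_t$ must be
interpreted as the random direction in $\mathbb{R}^d$ at time $t$ which maximizes

$\langle v, [M]_t v\rangle_{\mathbb{R}^d}$

over the unit vectors $\|v\|_{\mathbb{R}^d} = 1$ orthogonal to $v^1_t,\dots,v^{i-1}_t$.
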
Remark 3.2. We stress that

$[S^i, S^i]_t \ne [(v^i_t)^\top M]_t = \sum_{\ell,m=1}^{d} v^{i,\ell}_t v^{i,m}_t [M^\ell, M^m]_t; \quad 0 \le t \le T,\ i = 1,\dots,d,$

where $S^i_r = (v^i_r)^\top M_r$, $0 \le r \le t$, $1 \le i \le d$. Therefore, our methodology is rather different from
Ait-Sahalia and Xiu [5]. In qualitative terms, our framework does not lose information in terms of
the underlying quadratic variation space $\mathcal{W}$ (see Proposition o) and hence in terms of $M$ as well.
In addition, we do not require a simple eigenvalue structure as required therein.

Let us now briefly discuss the importance of the subspaces $(\mathcal{D}, \mathcal{W})$ in concrete multi-dimensional
semimartingale systems.

4. Bounded variation component and quadratic variation in $\mathcal{M}$

In this section, we discuss two concrete examples of models which exemplify the importance of
analyzing the principal components of high-dimensional semimartingale systems in terms of $(\mathcal{W}, \mathcal{D})$
rather than covariance matrices.

4.1. Correlation in d-dimensional asset prices. Correlation among asset prices is a well-known
phenomenon and it has been studied by many authors in the context of covariance and, more recently,
quadratic variation matrices. Let us suppose the asset log-prices form a $d$-dimensional Itô process

(4.1)   $M^i_t = M^i_0 + \int_0^t b^i_s\,ds + \int_0^t \sigma^i_s\,dB_s; \quad i = 1,\dots,d;\ 0 \le t \le T,$

where $b : [0,T] \times \Omega \to \mathbb{R}^d$ and the matrix-valued $\sigma$ on $[0,T] \times \Omega$ (with rows $\sigma^i$) satisfy the usual
conditions to get a well-defined $d$-dimensional semimartingale. For simplicity of exposition, let us
assume that $d$ is known.

One typical example of the existence of a bounded variation component in $\mathcal{M}$ is the occurrence of
correlation among $M^1,\dots,M^d$, which can be measured by volatility, i.e., quadratic variation. This type
of phenomenon has been recently studied by Ait-Sahalia and Xiu [1], who identify nontrivial correlation
among $\{M^i;\ i = 1,\dots,d\}$ by means of suitable estimators of $[M^i, M^j]_T$; $i,j = 1,\dots,d$. In the presence
of correlation among assets as in [4], the subspace $\mathcal{D}$ naturally emerges as a non-trivial subspace of
$\mathcal{M}$ due to the fact that $\operatorname{rank}[M]_T < d$. See also Buraschi, Porchia and Trojani for a discussion
of correlation in the context of optimal portfolio choice.


4.2. Stochastic PDEs with finite-dimensional realizations. Let us describe how $(\mathcal{W}, \mathcal{D})$ arises in
the context of stochastic PDEs. Let us concentrate the discussion on one major research theme related
to interest rate modelling: the calibration problem of Heath-Jarrow-Morton models [32] (henceforth
abbreviated by HJM) based on forward rate curves. We refer the reader to e.g [T31 [T31|T31I23| and other
references therein for a detailed discussion on this issue. The classical HJM model can be described
by a stochastic PDE of the form

(4.2)   $dr_t = \big(A(r_t) + \alpha_{HJM}(r_t)\big)\,dt + \sum_{i=1}^{m} \sigma^i(r_t)\,dB^i_t; \quad r_0 \in E,$

where $A = \frac{d}{dx}$ is the first-order derivative operator acting as an infinitesimal generator of a $C_0$-semigroup
on a separable Hilbert space $E$ which we assume to be a space of functions $g : \mathbb{R}_+ \to \mathbb{R}$.
The drift vector field $\alpha_{HJM}$ has great importance for pricing and hedging derivative products and it
is fully determined by $\sigma = (\sigma^1,\dots,\sigma^m)$ under a martingale measure. See e.g m for more details.

One central issue in the literature is the use of the stochastic PDE (4.2) in practice. In this case, it
is very important to know when (4.2) admits a finite-dimensional subset $\mathcal{G}$ which the stochastic PDE
never leaves as long as the initial forward rate curve $r_0 \in \mathcal{G}$, namely

$\mathbb{P}\{r_t \in \mathcal{G};\ \forall t \in [0,T]\} = 1$ if $r_0 \in \mathcal{G}$.

The subset $\mathcal{G}$ can be interpreted as a finite-dimensional parameterized family of smooth curves
$\mathcal{G} = \{G(\cdot\,;x);\ x \in \mathcal{Z} \subset \mathbb{R}^d\} \subset E$ which can be used to estimate the volatility component of the model (4.2)
starting with an initial curve $r_0 \in \mathcal{G}$. See e.g IZl HI]. Therefore, one central issue in interest rate
modelling is the existence, characterization and estimation of $\mathcal{G}$. See [laiiaiiiiEaEsiEaEiiiii
miziiin] and other references therein.

As far as the existence is concerned, Bjork and Svensson and Filipovic and Teichmann [25]
have shown that the existence of $\mathcal{G}$ is equivalent to

$\dim \{\mu, \sigma^i;\ i = 1,\dots,m\}_{LA} < \infty$

in a neighborhood of $r_0$, where $\mu$ is the Stratonovich drift induced by $\sigma$ and $x \mapsto \{\mu, \sigma^1,\dots,\sigma^m\}_{LA}(x)$
is the Lie algebra generated by the vector fields $\mu, \sigma^1,\dots,\sigma^m$. In fact, $\mathcal{G} \subset E$ must be an affine
submanifold of $E$. In particular, there exists a parametrization $\phi : [0,T] \to E$, a truly $d$-dimensional
Brownian semimartingale $M = (M^1,\dots,M^d)$ and a linear subspace $V = \operatorname{span}\{\lambda_1,\dots,\lambda_d\}$ spanned
by a basis such that

(4.3)   $r_t(x) = \phi_t(x) + \sum_{i=1}^{d} M^i_t \lambda_i(x)$ a.s; $0 \le t \le T$; $x \ge 0$.

Under some assumptions (see e.g. Duffie and Kan), the semimartingale state process $M$ can be
generically written as an affine process. In contrast to the previous example of sample data from the
$d$-dimensional semimartingale (4.1), $M$ in (4.3) is not observed.

For a given pair $(M, V)$ as above, one can actually show there exists a unique splitting $V = V_1 \oplus V_2$
which realizes

$r_t(x) = \phi_t(x) + \sum_{i=1}^{p} Y^i_t \varphi_i(x) + \sum_{j=p+1}^{d} Y^j_t \varphi_j(x)$

for $0 \le t \le T$; $x \ge 0$. Here, $\{Y^i;\ i = 1,\dots,p\}$ is a basis for $\mathcal{W}$ and $\{Y^j;\ j = p+1,\dots,d\}$ is a basis for
$\mathcal{D}$ such that

$\mathcal{M} = \mathcal{W} \oplus \mathcal{D}.$

Moreover, $V_1 = \operatorname{span}\{\varphi_1,\dots,\varphi_p\}$ and $V_2 = \operatorname{span}\{\varphi_{p+1},\dots,\varphi_d\}$. The loading factors associated to
$V_2$ are related to the risk factors in $\mathcal{D}$ which in turn are associated to no-arbitrage restrictions.

Under the assumption that a stochastic PDE (one typical example is (4.2)) admits a finite-dimensional
realization (4.3), we are going to present consistent estimators for the minimal invariant
subspace $V$. More precisely, based on high-frequency data and techniques from factor analysis, we
take advantage of the structure induced by $(\mathcal{W}, \mathcal{D})$ in order to provide consistent estimators $(\hat{V}_1, \hat{V}_2)$
for $(V_1, V_2)$ related to the minimal invariant subspace $V$.

4.3. Noise dimension vs quadratic variation dimension. It is convenient to point out that the
rank of a quadratic variation matrix is not the maximal rank of the underlying volatility process
studied by Jacod and Podolskij and Fissler and Podolskij [30]. See also Ait-Sahalia and Xiu for a
similar framework. In fact, let $M$ be a $d$-dimensional Itô process of the form

$M_t = M_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dB_s; \quad 0 \le t \le T.$

Let $R_t := \sup_{0 \le s \le t} \operatorname{rank}(c_s)$; $0 \le t \le T$, where $c_s := \sigma_s \sigma_s^\top$; $0 \le s \le T$.

Proposition 4.1. If $\sigma$ has continuous paths, then $R_t \le \operatorname{rank}[M]_t$ a.s for every $t \in [0,T]$. Moreover,
the inequality may be strict.

Proof. Let us fix a realization $\omega \in \Omega$ in a set of full measure and some $t$ in $[0,T]$. Let, also, $R_t > 0$.
Then, since $c_\cdot$ is a continuous matrix-valued function, and the rank is an integer-valued
lower-semicontinuous function, there exists $t^* \in [0,t]$ such that $\operatorname{rank} c_{t^*} = R_t$.

Since $c_{t^*}$ is a non-negative definite matrix, we can find a set of $R_t$ linearly independent eigenvectors
for $c_{t^*}$, say, $v_1,\dots,v_{R_t}$, with respective eigenvalues $\lambda_1,\dots,\lambda_{R_t}$, such that $\lambda_i > 0$, for $i = 1,\dots,R_t$.

Now, observe that if $c_1,\dots,c_{R_t}$ are real numbers such that $c_1^2 + \dots + c_{R_t}^2 > 0$, then by putting
$w = c_1 v_1 + \dots + c_{R_t} v_{R_t}$ and using the orthogonality of the eigenvectors, we have

(4.4)   $\langle w, c_{t^*} w\rangle_{\mathbb{R}^d} = c_1^2 \lambda_1 + \dots + c_{R_t}^2 \lambda_{R_t} > 0.$

Note also that, for any such vector $w$, the function $s \mapsto \langle w, c_s w\rangle_{\mathbb{R}^d}$ is continuous, so we can find an
open interval $I$ containing $t^*$, with length $|I| = 2\delta$ (for some $\delta > 0$), satisfying

(4.5)   $\forall s \in I, \quad \langle w, c_s w\rangle_{\mathbb{R}^d} \ge \tfrac{1}{2} \langle w, c_{t^*} w\rangle_{\mathbb{R}^d}.$

Furthermore, using the non-negative definiteness of $c_s$, we have that

(4.6)   $\forall u \in [0,T], \quad \langle w, c_u w\rangle_{\mathbb{R}^d} \ge 0.$

Now, suppose, by contradiction, that $\operatorname{rank}[M]_t < R_t$. Then, we can find real numbers $c_1,\dots,c_{R_t}$,
with $c_1^2 + \dots + c_{R_t}^2 > 0$, such that, for $w = c_1 v_1 + \dots + c_{R_t} v_{R_t}$, where $v_1,\dots,v_{R_t}$ are the eigenvectors
of $c_{t^*}$ given above, we have $[M]_t w = 0$, and, in particular,

$\langle w, [M]_t w\rangle_{\mathbb{R}^d} = 0.$

Then, using (4.4), (4.5) and (4.6), we obtain

$0 = \langle w, [M]_t w\rangle_{\mathbb{R}^d} = \int_0^t \langle w, c_s w\rangle_{\mathbb{R}^d}\,ds$
$\quad \ge \int_I \langle w, c_s w\rangle_{\mathbb{R}^d}\,ds$
$\quad \ge \int_I \tfrac{1}{2} \langle w, c_{t^*} w\rangle_{\mathbb{R}^d}\,ds$
$\quad = \delta \langle w, c_{t^*} w\rangle_{\mathbb{R}^d} > 0.$

This contradiction shows that $R_t \le \operatorname{rank}[M]_t$.

To show that the inequality may be strict, consider the following example: Let us assume that
$T > 1$ and we take

$\sigma_s = \begin{pmatrix} f(s) & 0 \\ 0 & f(s-1) \end{pmatrix},$

where $f(t) = t(1-t)\mathbb{1}_{[0,1]}(t)$, with $\mathbb{1}_A$ being the indicator function of the set $A$. Then, clearly,

$c_s = \begin{pmatrix} f(s)^2 & 0 \\ 0 & f(s-1)^2 \end{pmatrix},$

and $R_t = 1$, for all $t > 0$, whereas $\operatorname{rank}[M]_t = 2$ for $t > 1$. □
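
The counterexample above can be checked numerically. The following is only a minimal sketch under stated assumptions: the helper names (`spot_rank`, `qv_diagonal`, `qv_rank`), the midpoint quadrature rule and the tolerances are illustrative choices and are not part of the original argument. It uses $[M]_t = \int_0^t c_s\,ds$ with $c_s = \operatorname{diag}(f(s)^2, f(s-1)^2)$.

```python
def f(t):
    """f(t) = t(1 - t) on [0, 1] and 0 elsewhere."""
    return t * (1.0 - t) if 0.0 <= t <= 1.0 else 0.0


def spot_rank(s, tol=1e-12):
    """Rank of the diagonal spot matrix c_s = diag(f(s)^2, f(s-1)^2)."""
    return sum(1 for entry in (f(s) ** 2, f(s - 1.0) ** 2) if entry > tol)


def qv_diagonal(t, n=20_000):
    """Midpoint-rule approximation of the diagonal of [M]_t = int_0^t c_s ds."""
    h = t / n
    mids = [(k + 0.5) * h for k in range(n)]
    first = sum(f(s) ** 2 for s in mids) * h
    second = sum(f(s - 1.0) ** 2 for s in mids) * h
    return first, second


def qv_rank(t, tol=1e-8):
    """Rank of the (diagonal) quadratic variation matrix [M]_t."""
    return sum(1 for entry in qv_diagonal(t) if entry > tol)


t = 1.5
grid = [t * k / 1000 for k in range(1001)]
R_t = max(spot_rank(s) for s in grid)      # sup over [0, t] of rank(c_s), on a grid
qv_first, qv_second = qv_diagonal(t)       # diagonal entries of [M]_t
rank_qv_late = qv_rank(t)                  # t > 1
rank_qv_early = qv_rank(0.5)               # t < 1
```

On this grid the spot rank never exceeds one, while both diagonal entries of $[M]_t$ are strictly positive once $t > 1$, which is the strict inequality claimed in Proposition 4.1.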


Remark 4.1. The main message of the above proposition is that if a direction has a non-null quadratic
variation for some time $t_0 > 0$, then this direction has non-null quadratic variation for all times $t \ge t_0$.
This phenomenon does not occur with the volatility matrix $c_s$, as shown above.

We also stress that Assumptions 2.1 and 2.2 make it possible to study a statistical test to check the
existence of a null quadratic variation component in $\mathcal{M}$. The full derivation of the statistical test will
be further explored in a future paper.

Corollary 4.1. Let $M \in \mathcal{X}^d$ be a truly $d$-dimensional process satisfying Assumption 2.2. Let
$\lambda^1_T,\dots,\lambda^d_T$ be the ordered eigenvalues of the associated quadratic variation matrix $[M]_T$ such that
$\lambda^1_T \ge \dots \ge \lambda^d_T$. The test $H_0 : \lambda^d_T = 0$ versus $H_1 : \lambda^d_T > 0$ is a well-defined statistical test and it is
equivalent to $H_0 : \operatorname{rank}[M]_T < d$ versus $H_1 : \operatorname{rank}[M]_T = d$.

Remark 4.2. It is pertinent to interpret $\mathcal{M} = \mathcal{W} \oplus \mathcal{D}$ from the perspective of semimartingale-based
factor models. When $\operatorname{rank}[M]_T < d$ then

$\mathcal{M} = \mathcal{W} \oplus \mathcal{D}$

where $\dim \mathcal{D} > 0$. In applications, one may think of $\mathcal{M}$ as the space of high-dimensional portfolios
composed by $M$ which can be decomposed into two dynamic spaces. When $[M]_T$ is singular, then the
dynamic space has to be filled with zero quadratic variation dynamics which can be neglected only
if one is solely interested in volatility. We stress that this phenomenon is intrinsic to the principal
component analysis of high-dimensional semimartingale systems.

5. Estimation of $(\mathcal{W}, \mathcal{D})$

In this section, we show how to estimate the pair $(\mathcal{W}, \mathcal{D})$ which realizes

$\mathcal{M} = \mathcal{W} \oplus \mathcal{D}$

for a given observed process $M \in \mathcal{X}^d$ satisfying Assumptions 2.1 and 2.2. The reader may think of $(\mathcal{W}, \mathcal{D})$
as a pair of factor spaces which are not observed. We stress that even if one observes all trajectories of $M$,
the components of $\mathcal{D}$ are not visible when $\dim \mathcal{D} > 0$.

5.1. Identification of the Spaces $(\mathcal{W}, \mathcal{D})$. Throughout this section, we are going to fix a truly
$d$-dimensional process $M = (M^1,\dots,M^d) \in \mathcal{X}^d$ satisfying Assumption 2.2. Let $\mathcal{M} = \mathcal{W} \oplus \mathcal{D}$ be the
splitting introduced in (2.8). We assume that $\dim \mathcal{W} = p$ and $\dim \mathcal{D} = d - p$, where $1 \le p \le d$. In
order to clarify the exposition, we first assume that one is able to observe all trajectories of a given
$M \in \mathcal{X}^d$ in continuous time.

Proposition 5.1. Let $M = (M^1,\dots,M^d)$ be a $d$-dimensional process satisfying Assumptions 2.1 and
2.2, let $\operatorname{span}\{M^1,\dots,M^d\} = \mathcal{M}$ and let $[M]_T$ be the quadratic variation matrix of $M$. Let $\{v_1,\dots,v_d\}$
be an orthonormal basis formed by eigenvectors associated to the ordered (decreasing order) eigenvalues
of $[M]_T$. Let $V : \Omega \to \mathbb{M}_{d \times d}$ be the random matrix given by

$V(\omega) := \{v_{ij}(\omega);\ 1 \le i,j \le d\},$

where $v_i = (v_{i1},\dots,v_{id})$; $1 \le i \le d$. Then there exists a set $\Omega^*$ of full measure such that for each
realization $\omega \in \Omega^*$, $\{(V(\omega)M_\cdot)^i;\ p+1 \le i \le d\}$ is a basis for $\mathcal{D}$ and $\{(V(\omega)M_\cdot)^i;\ 1 \le i \le p\}$ is a basis
for $\mathcal{W}$. Moreover,

(5.1)   $[(VM)^1]_T \ge [(VM)^2]_T \ge \dots \ge [(VM)^d]_T$ a.s.

Proof. By applying the standard spectral theorem on $[M]_T(\omega)$, we can find a set of eigenvectors
$\{v_i(\omega);\ 1 \le i \le d\}$ associated to $[M]_T(\omega)$ which constitutes an orthonormal basis for $\mathbb{R}^d$, so that
$V(\omega)$ is invertible for every $\omega \in \Omega$. Let $p = \dim \mathcal{W}$. If $d - p > 0$ then Lemma 2.1 yields
$v_i \in \operatorname{Ker}[M]_T$ a.s; $p+1 \le i \le d$. Therefore, $[M]_T v_i$ is null a.s for every $i \in \{p+1,\dots,d\}$, which implies
that the last $d - p$ rows of $V(\omega)[M]_T(\omega)$ are null for every $\omega \in \Omega^*$, where $\Omega^*$ has full probability. Let
us fix $\omega^* \in \Omega^*$ and we write $V = V(\omega^*)$, $v_i = v_i(\omega^*)$, $(J^1,\dots,J^d) \in \mathcal{X}^d$, where $J^i = (VM)^i$; $1 \le i \le d$.
Now, since $V$ is invertible, $\{J^1,\dots,J^d\}$ is a linearly independent subset of $\mathcal{M}$. Moreover,

$\sum_{j=1}^{d} v_{ij} [M^\ell, M^j]_T = 0$ a.s; $\ell = 1,\dots,d$; $i = p+1,\dots,d$,

which by linearity implies that

(5.2)   $\Big[M^\ell, \sum_{j=1}^{d} v_{ij} M^j\Big]_T = 0$ a.s; $\ell = 1,\dots,d$; $i = p+1,\dots,d$.

More importantly, (5.2) yields $\big[\sum_{j=1}^{d} v_{ij} M^j\big]_T = 0$ a.s; $p+1 \le i \le d$. Since $\operatorname{span}\{J^{p+1},\dots,J^d\} \subset \mathcal{M}$,
we actually have $\operatorname{span}\{J^{p+1},\dots,J^d\} \subset \mathcal{D}$ and the linear independence yields $\operatorname{span}\{J^{p+1},\dots,J^d\} = \mathcal{D}$.
Therefore,

(5.3)   $\operatorname{span}\{J^1,\dots,J^d\} = \operatorname{span}\{J^1,\dots,J^p\} \oplus \mathcal{D} \subset \mathcal{M} = \mathcal{W} \oplus \mathcal{D}.$

Since $\{J^1,\dots,J^p\}$ is a linearly independent subset of $\mathcal{M}$, then (5.3) yields

$\operatorname{span}\{J^1,\dots,J^p\} = \mathcal{W}.$

Lastly, the ordering (5.1) is an immediate consequence of Proposition 3.1. □

With the obvious modifications, we stress that the result of Proposition 5.1 also holds over $[0,t]$ for
every $0 < t \le T$.

5.2. Estimation of the spaces $(\mathcal{W}, \mathcal{D})$. Let us suppose that we are in the same setup of the previous
section, but now we have high-frequency observations at hand from a truly $d$-dimensional process
$M = (M^1,\dots,M^d)$ satisfying Assumption 2.2. In this section, the high-frequency data is assumed to
be observed at common regular times for each $M^i$; $i = 1,\dots,d$. We leave the case of non-synchronous
data to future research. Throughout this section, we assume the existence of a consistent estimator
$\widehat{[M]}_T$ for $[M]_T$ which satisfies the following assumption:

Assumption 5.1. $\widehat{[M]}_T$ is a sequence of non-negative definite and self-adjoint matrices such that
$\widehat{[M]}_T \xrightarrow{p} [M]_T$ as $\|\Pi\| \to 0$.

In the sequel, we fix $\widehat{[M]}_T$ satisfying Assumption 5.1 and we choose any consistent estimator $\hat{p}$
for $\operatorname{rank}[M]_T$. The goal of this section is to describe a generic estimation methodology based on the
existence of $\widehat{[M]}_T$ satisfying Assumption 5.1. We stress that the results of this section do not depend on the
estimator of the quadratic variation matrix. We refer the reader to e.g [miiniisiiiMiisiiiHiiisiiH]
and other references therein for a complete view of the estimation methods for $[M]_T$.
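
As an illustration only, one standard candidate for an estimator of the quadratic variation matrix in the synchronous, noise-free setting is the realized covariation, i.e. the sum of outer products of the increments. The sketch below is an assumption-laden example, not the estimator prescribed by the text (which deliberately leaves the choice open): the function name `realized_covariation` and the toy path, in which the second coordinate is twice the first, are hypothetical.

```python
import random


def realized_covariation(path):
    """Sum of outer products of increments of a d-dimensional path.

    `path` is a list of observations on a common time grid, each one a list
    of d floats. The result is a d x d symmetric, non-negative definite matrix.
    """
    d = len(path[0])
    rc = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        inc = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                rc[i][j] += inc[i] * inc[j]
    return rc


# Toy data (illustrative): a +/-0.1 random walk and twice that walk,
# so the two coordinates are perfectly correlated in quadratic variation.
rng = random.Random(0)
m1 = [0.0]
for _ in range(100):
    m1.append(m1[-1] + rng.choice((-0.1, 0.1)))
path = [[x, 2.0 * x] for x in m1]

qv_hat = realized_covariation(path)
det = qv_hat[0][0] * qv_hat[1][1] - qv_hat[0][1] * qv_hat[1][0]
```

In this toy path the estimated matrix is singular, which is the situation in which a non-trivial null quadratic variation direction appears.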

We need to define a metric notion on the set of finite-dimensional subspaces embedded in a possibly
infinite-dimensional vector space. For this task, we make use of the same metric between subspaces
defined by Bathia et al. Let $\mathcal{M}_1$ and $\mathcal{M}_2$ be two finite-dimensional Hilbert subspaces of an
inner product vector space $H$ with dimensions $m_1$ and $m_2$, respectively. Let $\{\zeta_{i1},\dots,\zeta_{i m_i}\}$ be an
orthonormal basis of $\mathcal{M}_i$, $i = 1,2$. Then, we define

(5.4)   $D(\mathcal{M}_1, \mathcal{M}_2) := \sqrt{1 - \frac{1}{\max\{m_1, m_2\}} \sum_{k=1}^{m_1} \sum_{j=1}^{m_2} \big(\langle \zeta_{2j}, \zeta_{1k}\rangle_H\big)^2}.$

In the sequel, we need to compute distances for finite-dimensional subspaces which are not embedded
in a natural common Hilbert space. For this reason, let $\mathcal{A}$ be a finite-dimensional linear space. If $\mathcal{A}_1$
and $\mathcal{A}_2$ are finite-dimensional subspaces of $\mathcal{A}$, then we define

(5.5)   $d(\mathcal{A}_1, \mathcal{A}_2) := D(\Phi(\mathcal{A}_1), \Phi(\mathcal{A}_2))$

where $\Phi : \mathcal{A} \to \mathbb{R}^m$ is the canonical isomorphism and $\dim \mathcal{A} = m$. One can easily check that
$d$ is indeed a metric over the set of all finite-dimensional subspaces of $\mathcal{A}$. The metric $d$ in (5.5) is very
convenient to study consistency of subspace estimators.
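
A minimal sketch of the quantity in (5.4) for subspaces of a Euclidean space may help fix ideas. It assumes each subspace is supplied through an orthonormal basis given as plain lists of coordinates; the function name `subspace_distance` and the three test configurations are illustrative assumptions.

```python
import math


def subspace_distance(basis1, basis2):
    """D(M1, M2) as in (5.4); each basis is a list of orthonormal vectors."""
    m1, m2 = len(basis1), len(basis2)
    total = sum(
        sum(x * y for x, y in zip(z1, z2)) ** 2
        for z1 in basis1
        for z2 in basis2
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - total / max(m1, m2)))


e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
theta = 0.3
plane = [e1, e2]
# A different orthonormal basis of the same plane.
rotated_plane = [
    [math.cos(theta), math.sin(theta), 0.0],
    [-math.sin(theta), math.cos(theta), 0.0],
]

d_same = subspace_distance(plane, rotated_plane)   # same subspace
d_orth = subspace_distance([e1], [e2])             # orthogonal lines
d_nested = subspace_distance([e1], plane)          # line inside a plane
```

The value does not depend on which orthonormal basis represents a subspace, it vanishes for equal subspaces and it equals one for orthogonal ones.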

Before presenting the main result of this section, we need two preliminary lemmas.

Lemma 5.1. Let $C_n, C : \Omega \to \mathbb{M}_{d \times d}$ be a sequence of self-adjoint real $d \times d$ matrices such that
$C_n \xrightarrow{p} C$ as $n \to \infty$. Assume that $q = \dim \operatorname{Ker}(C)$ a.s and let us denote by $v^n_1,\dots,v^n_q$ a set of
orthonormal eigenvectors associated to the $q$ least eigenvalues of $C_n$. Let $K_n = \operatorname{span}\{v^n_1,\dots,v^n_q\}$ and
$K = \operatorname{Ker}(C)$. Then,

$D(K_n, K) \xrightarrow{p} 0$

as $n \to \infty$.


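Footnote: For instance, if $\mathbb{E}\|\widehat{[M]}_T - [M]_T\| \le O(r_n)$, then choosing $\epsilon \to 0$ in such a way that $\epsilon\,(r_n)^{-1} \to \infty$ as $n \to \infty$ allows
us to take $\hat{p}$ = the number of non-zero eigenvalues of $\widehat{[M]}_T$ bigger than $\epsilon$ as a consistent estimator.

Proof. Let $\{v_1,\dots,v_d\}$ be an orthonormal basis for $\mathbb{R}^d$ given by eigenvectors of $C$. Let $v_1,\dots,v_q$ be
an orthonormal subset of eigenvectors of $C$ associated to eigenvalues $a_1,\dots,a_q$ and $\operatorname{Ker}(C) =
\operatorname{span}\{v_1,\dots,v_q\}$. To shorten notation, in the sequel we denote by $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ the inner product
and the norm over Euclidean spaces. We may assume that $0 < q < d$. Let $\{v_{q+1},\dots,v_d\}$ be a basis for the
orthogonal complement $K^\perp$, with associated eigenvalues $a_{q+1},\dots,a_d$. At first, we notice that since $K_n$ and $K$
have the same dimension, it is sufficient to prove that $D(K_n, K^\perp) \xrightarrow{p} 1$. This is equivalent to proving that

$\sum_{j=1}^{q} \sum_{i=1}^{d-q} \langle v^n_j, v_{q+i}\rangle^2 \xrightarrow{p} 0.$

To do so, let $\varrho^n_{ij} = \langle v^n_j, v_{q+i}\rangle v_{q+i}$, and note that $\|\varrho^n_{ij}\| \le 1$ a.s and $C\varrho^n_{ij} = \langle v^n_j, v_{q+i}\rangle a_{q+i} v_{q+i}$.
Therefore,

$\sum_{j=1}^{q} \sum_{i=1}^{d-q} \langle C\varrho^n_{ij}, v^n_j\rangle = \sum_{j=1}^{q} \sum_{i=1}^{d-q} a_{q+i} \langle v^n_j, v_{q+i}\rangle^2$

and since $\langle C\varrho^n_{ij}, v^n_j\rangle \ge a_{q+1} \langle v^n_j, v_{q+i}\rangle^2$ we conclude that

$\sum_{j=1}^{q} \sum_{i=1}^{d-q} \langle v^n_j, v_{q+i}\rangle^2 \le \frac{1}{a_{q+1}} \sum_{j=1}^{q} \sum_{i=1}^{d-q} \langle C\varrho^n_{ij}, v^n_j\rangle$
$\quad \le \frac{1}{a_{q+1}} \sum_{j=1}^{q} \sum_{i=1}^{d-q} \|\varrho^n_{ij}\|\,\|C v^n_j\|$
$\quad \le \frac{1}{a_{q+1}} \sum_{j=1}^{q} \sum_{i=1}^{d-q} \|C v^n_j\|$ a.s $\forall n \ge 1$.

We now claim that

(5.6)   $\sup_{v \in K_n,\ \|v\| = 1} \|C v\| \xrightarrow{p} 0.$

Let $a^n_1 \ge a^n_2 \ge \dots \ge a^n_q$ be the ordered eigenvalues of $C_n$ related to the $q$ least eigenvalues. Let $\gamma_n$ be
the number of non-zero eigenvalues of $C_n$. We have $\mathbb{P}\{\gamma_n = d - q\} = 1$ for every $n$ sufficiently large
so that

$\sup_{v \in K_n,\ \|v\| = 1} \|C_n v\| \le a^n_1 \xrightarrow{p} 0$

as $n \to \infty$.

On the other hand, $C_n \xrightarrow{p} C$ as $n \to \infty$ and hence

$\sup_{v \in K_n,\ \|v\| = 1} \|C_n v - C v\| \xrightarrow{p} 0$

as $n \to \infty$. Therefore, the triangle inequality yields

$\sup_{v \in K_n,\ \|v\| = 1} \|C v\| \le \sup_{v \in K_n,\ \|v\| = 1} \|C_n v - C v\| + \sup_{v \in K_n,\ \|v\| = 1} \|C_n v\|$
$\quad \le \sup_{\|v\| = 1} \|C_n v - C v\| + \sup_{v \in K_n,\ \|v\| = 1} \|C_n v\| \xrightarrow{p} 0$

as $n \to \infty$. This shows (5.6) and we may conclude the proof. □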

Lemma 5.2. Let $M \in \mathcal{X}^d$ be a truly $d$-dimensional process satisfying Assumption 2.2. Then, the set
$\ker[M]_t$ is deterministic for every $t \in [0,T]$.

Proof. For $t = 0$ the statement is obvious, so let us fix $t \in (0,T]$ and let $\mathcal{D}_t$ be the subspace of
$\mathcal{M}_t$ given by (2.3). Let $p_t$ be the dimension of $\mathcal{W}_t$. Let $N^1,\dots,N^{d-p_t}$ be a basis of $\mathcal{D}_t$ and let
$R^1,\dots,R^{p_t}$ be a complement basis in such a way that $\{N^1,\dots,N^{d-p_t}, R^1,\dots,R^{p_t}\}$ is a basis
of $\mathcal{M}_t$. Let $A$ be the change of basis from $\{N^1,\dots,N^{d-p_t}, R^1,\dots,R^{p_t}\}$ to $M = \{M^1,\dots,M^d\}$ with
matrix representation $A = (a_{ij})_{1 \le i,j \le d}$. We set $\Omega^* := \Omega - O$ where $O := \{\omega;\ \operatorname{rank}[M]_t(\omega) \ne p_t$
or $[N^i]_t(\omega) > 0$ for some $i \in \{1,\dots,d-p_t\}\}$. From Lemma 2.1 and the definition of $\mathcal{D}_t$, we know
that $\Omega^*$ has full probability. We pick $\omega \in \Omega^*$. Of course,

$a_1 := (a_{11},\dots,a_{d1}),\ \dots,\ a_{d-p_t} := (a_{1(d-p_t)},\dots,a_{d(d-p_t)})$

constitutes a set of $d - p_t$ linearly independent deterministic vectors in $\mathbb{R}^d$ and by the very definition

$\big([M]_t(\omega) a_\ell\big)_i = \sum_{k=1}^{d} a_{k\ell} [M^i, M^k]_t(\omega) = [M^i, N^\ell]_t(\omega) = 0$

for $1 \le \ell \le d - p_t$, $1 \le i \le d$. Since $\ker[M]_t(\omega) \subset \mathbb{R}^d$ has dimension $d - p_t$ for every $\omega \in \Omega^*$, then
$\ker[M]_t(\omega) = \operatorname{span}\{a_1,\dots,a_{d-p_t}\}$ for every $\omega \in \Omega^*$. □

Let $\hat{V}$ be the orthogonal matrix formed by orthonormal eigenvectors of $\widehat{[M]}_T$. Of course, we are
not able to prove that $\hat{V}M$ converges to $VM$ due to the lack of identification of eigenvectors. What is
true is the following notion of convergence. In the sequel, if $\{A_n, B_n;\ n \ge 1\}$ is a sequence of random
variables, then

$A_n \gtrsim B_n$ as $n \to \infty$

means that $\mathbb{P}(A_n < B_n) \to 0$ as $n \to \infty$. We similarly define $\lesssim$, and $A_n \sim B_n$ when both $A_n \gtrsim B_n$
and $A_n \lesssim B_n$ as $n \to \infty$.

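Theorem 5.1. Let $M = (M^1,\dots,M^d)$ be a process satisfying Assumptions 2.1 and 2.2. Let $\widehat{[M]}_T$
be a consistent estimator for $[M]_T$ satisfying Assumption 5.1 and let $\hat{p}$ be any consistent estimator
for $\operatorname{rank}[M]_T$. Let $\hat{V}$ be the orthogonal matrix whose rows are formed by eigenvectors of $\widehat{[M]}_T$. If
$(\hat{J}^1,\dots,\hat{J}^d) := \hat{V}M_\cdot$, then let us define $\widehat{\mathcal{W}} := \operatorname{span}\{\hat{J}^1,\dots,\hat{J}^{\hat{p}}\}$ and $\widehat{\mathcal{D}} := \operatorname{span}\{\hat{J}^{\hat{p}+1},\dots,\hat{J}^d\}$.
Under the above conditions, we have

$d(\widehat{\mathcal{W}}, \mathcal{W}) \xrightarrow{p} 0$ and $d(\widehat{\mathcal{D}}, \mathcal{D}) \xrightarrow{p} 0$,

as $\|\Pi\| \to 0$. If $\widehat{\mathcal{M}} := \widehat{\mathcal{W}} \oplus \widehat{\mathcal{D}}$ then $d(\widehat{\mathcal{M}}, \mathcal{M}) \xrightarrow{p} 0$ as $\|\Pi\| \to 0$. Moreover,

(5.7)   $[\hat{J}^1]_T \gtrsim [\hat{J}^2]_T \gtrsim \dots \gtrsim [\hat{J}^d]_T$

(5.8)   $[\hat{J}^i]_T \xrightarrow{p} 0$; $\hat{p} < i \le d$ as $\|\Pi\| \to 0$.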


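Footnote: A random set $A$ is deterministic if there exists a subset $B \subset \mathbb{R}^d$ such that $A = B$ a.s.

Proof. Recall the definition of the isomorphism $\Phi$ used in (5.5). From Lemma 5.2, we have $\Phi(\mathcal{D}) =
\operatorname{Ker}([M]_T)$ and by the very definition of $\widehat{\mathcal{D}}$, we also have $\Phi(\widehat{\mathcal{D}}) = \operatorname{Ker}(\widehat{[M]}_T)$. Thus, from Lemma 5.1
above, we have

$d(\widehat{\mathcal{D}}, \mathcal{D}) \xrightarrow{p} 0.$

Now, notice that

$\mathbb{R}^d = \Phi(\mathcal{D}) \oplus \Phi(\mathcal{W}) = \Phi(\widehat{\mathcal{D}}) \oplus \Phi(\widehat{\mathcal{W}}).$

Therefore, it follows from the definition of the metric $d$ that

$d(\widehat{\mathcal{W}}, \mathcal{W}) \xrightarrow{p} 0.$

Since $\hat{p}$ is an integer-valued consistent estimator, we shall assume that $\hat{p} = p$. By the very definition,
we know that

$\langle \hat{v}_i, \widehat{[M]}_T \hat{v}_i\rangle_{\mathbb{R}^d} \ge \langle \hat{v}_{i+1}, \widehat{[M]}_T \hat{v}_{i+1}\rangle_{\mathbb{R}^d}$ a.s; $1 \le i \le d-1$,

and

$[\hat{J}^i]_T = \langle \hat{v}_i, [M]_T \hat{v}_i\rangle_{\mathbb{R}^d}$ a.s; $1 \le i \le d$.

Let us write

$[\hat{J}^i]_T - [\hat{J}^{i+1}]_T = \big([\hat{J}^i]_T - \langle \hat{v}_i, \widehat{[M]}_T \hat{v}_i\rangle_{\mathbb{R}^d}\big) + \big(\langle \hat{v}_i, \widehat{[M]}_T \hat{v}_i\rangle_{\mathbb{R}^d} - \langle \hat{v}_{i+1}, \widehat{[M]}_T \hat{v}_{i+1}\rangle_{\mathbb{R}^d}\big)$
$\quad + \big(\langle \hat{v}_{i+1}, \widehat{[M]}_T \hat{v}_{i+1}\rangle_{\mathbb{R}^d} - [\hat{J}^{i+1}]_T\big); \quad 1 \le i \le d-1.$

By construction, $\max_{1 \le i \le d} \|\hat{v}_i\|$ is bounded in probability and $\|\widehat{[M]}_T - [M]_T\| \to 0$ in probability as
$\|\Pi\| \to 0$. Moreover, $\langle \hat{v}_i, \widehat{[M]}_T \hat{v}_i\rangle_{\mathbb{R}^d} - \langle \hat{v}_{i+1}, \widehat{[M]}_T \hat{v}_{i+1}\rangle_{\mathbb{R}^d} \ge 0$ a.s and hence (5.7) holds true. The
proof of (5.8) is similar. □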
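
To illustrate the estimation scheme behind Theorem 5.1, the following sketch simulates a toy three-dimensional process and splits the rotated components by their estimated quadratic variation. Everything specific here is an assumption made for illustration: the model ($M^1 = W^1$, $M^2 = W^2$, $M^3 = W^1 + W^2 + t$), the use of the realized covariation as the estimator of $[M]_T$, the threshold `eps = n ** (-0.25)` used for the rank estimator, and all variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 10_000, 1.0
dt = T / n
t = np.linspace(0.0, T, n + 1)

# Two independent Brownian motions on the grid (simulation assumption).
dW = rng.standard_normal((n, 2)) * np.sqrt(dt)
W = np.vstack([np.zeros(2), np.cumsum(dW, axis=0)])

# Toy model: linearly independent components, but rank [M]_T = 2 because
# M1 + M2 - M3 = -t has null quadratic variation.
M = np.column_stack([W[:, 0], W[:, 1], W[:, 0] + W[:, 1] + t])

# Realized covariation as the estimator of [M]_T (one possible choice).
dM = np.diff(M, axis=0)
qv_hat = dM.T @ dM

# Eigen-decomposition, reordered so that eigenvalues are decreasing;
# the rows of V_hat are the eigenvectors.
eigval, eigvec = np.linalg.eigh(qv_hat)
order = np.argsort(eigval)[::-1]
eigval, V_hat = eigval[order], eigvec[:, order].T

# Thresholded rank estimator (the threshold is an illustrative choice).
eps = n ** (-0.25)
p_hat = int(np.sum(eigval > eps))

# Rotated paths J^i = (V_hat M)^i and their realized quadratic variations.
J_hat = M @ V_hat.T
W_hat_paths = J_hat[:, :p_hat]   # spans the estimate of W
D_hat_paths = J_hat[:, p_hat:]   # spans the estimate of D
qv_J = np.sum(np.diff(J_hat, axis=0) ** 2, axis=0)

# Direction of the true null quadratic variation component, for comparison.
v_null = np.array([1.0, 1.0, -1.0]) / np.sqrt(3.0)
alignment = abs(float(V_hat[-1] @ v_null))
```

In this toy setting the last rotated component is close to the pure drift direction and carries almost no realized quadratic variation, while the first two components are ordered by their quadratic variation.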

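A straightforward consequence is the following result.

Corollary 5.1. Assume that the hypotheses in Theorem 5.1 hold and let $Y \in \mathcal{M}$ be discretely-observed at
$\{Y_{t_r};\ 0 \le r \le n\}$ over $[0,T]$, where $0 = t_0 < t_1 < \dots < t_n = T$. Then, there exists $a = (a^1,\dots,a^d) \in \mathbb{R}^d$
such that

(5.9)   $\max_{0 \le r \le n} \Big| Y_{t_r} - \sum_{i=1}^{\hat{p}} a^i \hat{J}^i_{t_r} - \sum_{k=\hat{p}+1}^{d} a^k \hat{J}^k_{t_r} \Big| \xrightarrow{p} 0,$

as $\max_{1 \le r \le n} |t_r - t_{r-1}| \to 0$.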

Proof. Let us equip $\mathcal{X}$ with the topology of the uniform convergence in probability. Let $\mathcal{H}$ be the
smallest finite-dimensional subspace of $\mathcal{X}$ which contains $\{M^1,\dots,M^d, \hat{J}^1,\dots,\hat{J}^d\}$. Let $\Phi : \mathcal{H} \to \mathbb{R}^m$
be the canonical isomorphism for some $m > 0$. We notice that $\Phi$ is actually a homeomorphism when
$\mathcal{H}$ is endowed with the subspace topology. From Theorem 5.1 and the definition of the metric $d$, we
know that

(5.10)   $d(\widehat{\mathcal{M}}, \mathcal{M}) = D(\Phi(\widehat{\mathcal{M}}), \Phi(\mathcal{M})) = \sup_{\|x\|_{\mathbb{R}^d} = 1} \|\pi_{\Phi(\widehat{\mathcal{M}})} x - \pi_{\Phi(\mathcal{M})} x\| \xrightarrow{p} 0$

as $\|\Pi\| \to 0$, where $\pi_A$ denotes the projection onto a closed subspace $A$. Then from (5.10) and
using the fact that $\Phi$ is a homeomorphism, we get the existence of $a = (a^1,\dots,a^d) \in \mathbb{R}^d$ such that

$\Phi(Y) - \sum_{i=1}^{\hat{p}} a^i \Phi(\hat{J}^i) - \sum_{k=\hat{p}+1}^{d} a^k \Phi(\hat{J}^k) \xrightarrow{p} 0$

as $\|\Pi\| \to 0$, which implies the assertion in (5.9). □

Under the assumptions of Theorem 5.1, if $Y \in \mathcal{M}$ is a discretely-observed semimartingale at
$\{Y_{t_k};\ 1 \le k \le n\}$ over $[0,T]$, then we shall use Corollary 5.1 to estimate by OLS

$\hat{a} := \operatorname{argmin}_{a \in \mathbb{R}^d} \sum_{k=1}^{n} \big| Y_{t_k} - \langle a, \hat{V} M_{t_k}\rangle \big|^2,$

the regression coefficients which provide us the precise linear contribution of non-null quadratic variation
and pure drift components in $\mathcal{W}$ and $\mathcal{D}$, respectively. In this case, the following linear combination

$\hat{Y}_{t_r} := \sum_{i=1}^{\hat{p}} \hat{a}^i \hat{J}^i_{t_r} + \sum_{k=\hat{p}+1}^{d} \hat{a}^k \hat{J}^k_{t_r}$

represents $\{Y_{t_r};\ 0 \le r \le n\}$ by elements of $\widehat{\mathcal{W}} \oplus \widehat{\mathcal{D}}$ over the sample $\{Y_{t_k};\ 0 \le k \le n\}$ in $[0,T]$. The
estimation of the factor spaces $(\mathcal{W}, \mathcal{D})$ provides a tool for optimal asset allocation/dimension reduction
in high-dimensional portfolios composed by semimartingales, a topic which will be further explored
in a future paper.
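
The OLS step can be sketched as follows. This is only an illustration under assumptions: the factor paths in `J_hat` are simulated directly (two Brownian paths and one pure drift path) instead of being produced by an eigen-decomposition, the coefficients `a_true` are made up, and `p_hat = 2` is taken as given.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, p_hat = 1_000, 1.0, 2
t = np.linspace(0.0, T, n + 1)


def brownian_path():
    """Simulated Brownian path on the grid (illustrative stand-in)."""
    return np.concatenate([[0.0], np.cumsum(rng.standard_normal(n) * np.sqrt(T / n))])


# Columns play the role of the rotated factors J^1, J^2 (non-null quadratic
# variation) and J^3 (pure drift).
J_hat = np.column_stack([brownian_path(), brownian_path(), t])

# A discretely observed element Y of the span, with made-up coefficients.
a_true = np.array([2.0, -0.5, 3.0])
Y = J_hat @ a_true

# OLS regression of Y on the rotated factors.
a_hat, *_ = np.linalg.lstsq(J_hat, Y, rcond=None)

# Fitted path and its split into the two estimated factor spaces.
Y_hat = J_hat @ a_hat
Y_w = J_hat[:, :p_hat] @ a_hat[:p_hat]   # non-null quadratic variation part
Y_d = J_hat[:, p_hat:] @ a_hat[p_hat:]   # pure drift part
```

Since `Y` lies exactly in the span of the columns here, the regression recovers the coefficients up to rounding; with estimated factors one would expect an approximation in the sense of (5.9).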

6. Estimation of Finite-Dimensional Invariant Manifolds

In this section, we apply the theory developed in previous sections to present a methodology for the
estimation of finite-dimensional invariant manifolds related to space-time data generated by stochastic
PDEs of the form

(6.1)   $dr_t = \big(A(r_t) + F(r_t)\big)\,dt + \sum_{i=1}^{m} \sigma^i(r_t)\,dB^i_t; \quad t \ge 0;\ r_0 = h \in E,$

where $A$ is an infinitesimal generator of a $C_0$-semigroup on a separable Hilbert space $E$ which we
assume to be a subspace of absolutely continuous functions $g : K \to \mathbb{R}$, where for simplicity of
exposition we work with the one-dimensional space set $K = [a,b]$ where $-\infty < a \le x \le b < +\infty$.
The vector fields $F, \sigma^i$; $i = 1,\dots,m$ are assumed to be Lipschitz and the dimension $m$ is fixed.

6.1. Splitting the invariant manifold. Let us now introduce the basic geometric objects related
to the stochastic PDE (6.1) that we are interested in estimating. We refer the reader to Tappe [48]
for a very clear treatment of these objects.

Definition 6.1. A family $(V_t)_{t \ge 0}$ of affine manifolds in $E$ is called a foliation generated by a
finite-dimensional subspace $V \subset E$ if there exists $\phi \in C^1(\mathbb{R}_+; E)$ such that

$V_t = \phi(t) + V; \quad t \ge 0.$

The map $\phi$ is a parametrization of $(V_t)_{t \ge 0}$.

Remark 6.1. We notice that the parametrizations of $(V_t)_{t \ge 0}$ are not unique, but for any distinct
parametrizations $\phi^1$ and $\phi^2$ we have $\phi^1(t) - \phi^2(t) \in V$ for every $t \in [0,T]$.

In the remainder of this paper, $(V_t)_{t \ge 0}$ denotes a foliation generated by a finite-dimensional
subspace.


Footnote: Indeed, it is not too difficult to extend the results of this section to the multi-dimensional case where $K$ is a compact
subset of $\mathbb{R}^n$. This type of flexibility is important to treat more complex space-time data such as volatility surfaces in
Financial Engineering.

Definition 6.2. The foliation $(V_t)_{t \ge 0}$ of affine manifolds is invariant w.r.t the stochastic PDE (6.1)
if for every $t_0 \in \mathbb{R}_+$ and $h \in V_{t_0}$ we have

$\mathbb{P}\{r_t \in V_{t_0 + t}, \text{ for all } t \ge 0\} = 1$

for $r_0 = h$.

The above objects lead us to the following definition which is the main object of statistical study
in this section.

Definition 6.3. We say that the stochastic PDE (6.1) has an affine realization generated by a
finite-dimensional subspace $V \subset E$ if for each $h_0 \in \operatorname{dom}(A)$ there exists a foliation $(V^{h_0}_t)_{t \ge 0}$ generated by
$V$ with $h_0 \in V^{h_0}_0$ which is invariant w.r.t (6.1). An affine realization with a generator $V$ is called
minimal, if for another affine realization generated by some subspace $W$ we have $V \subset W$.

Remark 6.2. Suppose that the stochastic PDE (6.1) has an affine realization generated by a subspace
$V$. We recall that for each $h_0 \in \operatorname{dom}(A)$ the foliation $(V^{h_0}_t)_{t \ge 0}$ generated by $V$ is uniquely defined.
See e.g. Lemma 2.1 in [48].

See Section 4 for a brief discussion on affine realizations in the context of Mathematical Finance.

Throughout this paper, we assume that the stochastic PDE data generating process satisfies the
following assumption.

Assumption (A1): The stochastic PDE (6.1) has an affine realization generated by a finite-dimensional
subspace.

Let us now introduce the basic operators which will encode the underlying loading factors of
the stochastic PDE that we are interested in estimating. We fix once and for all a terminal time
$0 < T < \infty$, $r_0 \in \operatorname{dom}(A)$, the minimal subspace generator $V$ of (6.1) spanned by linearly independent
vectors $\{w_1,\dots,w_d\}$ and a parametrization $\phi \in C^1([0,T]; E)$ with null quadratic variation
$[\phi(u)]_T = 0$; $u \in [a,b]$. Under Assumption (A1), the stochastic PDE (6.1) has a strong solution. From the
reproducing kernel property of $E \subset C([a,b]; \mathbb{R})$, the evaluation map $\delta_u : f \mapsto f(u)$ is a bounded
linear functional and therefore point-wise evaluation of the stochastic PDE is well-defined for every
point-space and the following representation holds

(6.2)   $r_t(u) = r_0(u) + \int_0^t \big(A(r_s) + F(r_s)\big)(u)\,ds + \sum_{i=1}^{m} \int_0^t \sigma^i(r_s)(u)\,dB^i_s,$

where we set $r_t(u) := \delta_u r_t$ for $0 \le t \le T$ and $u \in [a,b]$. Let us consider the following kernels

$Q_t(u,v) := \int_0^t \sigma_s(u,v)\,ds, \qquad \sigma_t(u,v) := \sum_{i=1}^{m} \sigma^i(r_t)(u)\,\sigma^i(r_t)(v); \quad 0 \le t \le T.$

The above kernels induce random linear operators $Q_t$ and $\sigma_t$ defined almost everywhere by

$(Q_t f)(\cdot) := \langle Q_t(\cdot,\cdot), f\rangle_E; \quad f \in E,$

$(\sigma_t f)(\cdot) := \langle \sigma_t(\cdot,\cdot), f\rangle_E; \quad f \in E,\ 0 \le t \le T.$

By the very definition, the random linear operator $Q_t$ can be written as

$Q_t = \int_0^t \sigma_s\,ds,$

where we denote $\mathcal{Q} := \operatorname{Range} Q_T$. In the remainder of this article, we denote by $\mathcal{N}$ the supplementary
subspace of $\mathcal{Q}$ in the minimal subspace $V$.

From Assumption (A1), we know (see e.g Th. 2.11 and (2.27) in [?5]) that there exists a truly
$d$-dimensional semimartingale $Z = (Z^1,\dots,Z^d)$ which realizes the strong solution (6.2) as follows

(6.3)   $r_t(u) = \phi_t(u) + \sum_{i=1}^{d} Z^i_t w_i(u); \quad 0 \le t \le T,\ u \in [a,b].$

Definition 6.4. We say that the stochastic PDE in (6.1) admits a finite-dimensional realization
(FDR) if for each $h \in \operatorname{dom}(A)$ there exists a truly $d$-dimensional semimartingale $Z \in \mathcal{S}^d$, a
parametrization $\phi \in C([0,T]; E)$ and a linearly independent set $\{w_1,\dots,w_d\} \subset E$ which realize (6.3).

See e.g [HI [48l [24l US] for more details on this affine construction of the stochastic PDE.
Representation (6.3) is not unique but it will be the basis for our splitting scheme as follows. At first, in
order to apply the spectral analysis in previous sections, we will assume the following hypothesis on
the stochastic PDE (6.1):

Assumption (A2): For each initial condition $h \in \operatorname{dom}(A)$, there exists a factor representation $Z$
which realizes (6.3) and it satisfies Assumption 2.2.

In the sequel, if $L \in \mathbb{M}_{d \times d}$ and $\eta = (\eta_1,\dots,\eta_d)$ is a list of real-valued functions on $[a,b]$, then
$\eta(x) = (\eta_1(x),\dots,\eta_d(x)) \in \mathbb{M}_{d \times 1}$ and we set $L\eta$ meaning the $\mathbb{R}^d$-valued function $x \mapsto L\eta(x)$.

Remark 6.3. Let $r_t(u) = \phi_t(u) + \sum_{i=1}^{d} Z^i_t w_i(u)$; $0 \le t \le T$, $u \in [a,b]$ be a representation of the FDR
of (6.1). Let $A \in \mathbb{M}_{d \times d}$ be a nonsingular random matrix. Then

(6.4)   $r_t(x) = \phi_t(x) + \sum_{j=1}^{d} Y^j_t \varphi_j(x); \quad 0 \le t \le T,\ x \in [a,b]$

where $\varphi = (A^{-1})^\top w$ is a random basis for $V$ and $Y_\cdot = AZ_\cdot$.

We can actually write $Q_T$ in terms of any representation (6.3) as follows

(6.5)   $(Q_T f)(u) = \sum_{i,j=1}^{d} \langle f, w_i\rangle_E\, w_j(u)\, [Z^i, Z^j]_T; \quad f \in E;\ u \in [a,b],$

and, moreover, the following remark holds.

Remark 6.4. From Lemma 2.1, one can easily see that under Assumption (A2), any truly
$d$-dimensional factor process realizing (6.3) (or (6.4)) will satisfy Assumption 2.2.
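
The role of the quadratic variation of the latent factors in (6.5) can be illustrated on simulated curve data. The sketch below is hedged throughout: it takes $\phi = 0$, a spatial grid on $[0,1]$, loading functions $w_1(u) = 1$ and $w_2(u) = u$, a Brownian factor $Z^1$ and a pure drift factor $Z^2_t = t$, all of which are illustrative assumptions. It uses the matrix of realized covariations of the curve increments on the grid as a discrete stand-in for the kernel of $Q_T$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 5_000, 1.0
t = np.linspace(0.0, T, n + 1)
u = np.linspace(0.0, 1.0, 21)          # spatial grid on [a, b] = [0, 1]

# Latent factors: Z1 has non-null quadratic variation, Z2 is a pure drift.
Z1 = np.concatenate([[0.0], np.cumsum(rng.standard_normal(n) * np.sqrt(T / n))])
Z2 = t.copy()

# Loading functions evaluated on the grid (illustrative choices).
w1 = np.ones_like(u)
w2 = u

# Space-time data r_t(u) = Z1_t w1(u) + Z2_t w2(u), with phi = 0.
r = np.outer(Z1, w1) + np.outer(Z2, w2)

# Realized covariation of the curve increments across grid points.
dr = np.diff(r, axis=0)
Q_hat = dr.T @ dr

# Eigenvalues in decreasing order and a thresholded rank estimate.
eig = np.linalg.eigvalsh(Q_hat)[::-1]
eps = n ** (-0.25)
p_hat = int(np.sum(eig > eps))
ratio = eig[1] / eig[0]
```

Although two loading functions generate the curves, only the one attached to the factor with non-null quadratic variation contributes materially to `Q_hat`, so its numerical rank is one; the drift loading is invisible at this level.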

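In the sequel, we need to introduce new notation. For a given $Z \in \mathcal{X}^d$ satisfying Assumptions
2.1 and 2.2 we denote $\mathcal{M}(Z) := \operatorname{span}\{Z^1,\dots,Z^d\}$, $\overline{\mathcal{M}}(Z) := \mathcal{M}(Z)/\mathcal{D}(Z)$ where $\mathcal{D}(Z) := \{X \in
\mathcal{M}(Z);\ [X]_\cdot = 0$ a.s on $[0,T]\}$ and the quotient space is defined by the equivalence relation (2.5) over
$[0,T]$. We stress that $\mathcal{M}(Z)$, $\mathcal{D}(Z)$ and $\overline{\mathcal{M}}(Z)$ are $\mathcal{M}$, $\mathcal{D}$ and $\overline{\mathcal{M}}$, respectively, which are defined in
Section 2 for the specific choice $M = Z$.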

In practice, we are not able to observe any semimartingale factor $Z = (Z^1,\dots,Z^d)$ of a stochastic
PDE admitting an FDR. But it will be very important for our estimation strategy to identify the pair
$(\mathcal{Q}, \mathcal{N})$ in terms of the random matrix $[Z]_T$, or more precisely, in terms of the quadratic variation of
random rotations of $Z$. Next, we recall the following result.

Lemma 6.1. Let $r$ be the stochastic PDE (6.1) satisfying Assumptions (A1-A2) and admitting an
FDR generated by the minimal foliation $V^h_t = \{\phi_t + V\}$; $0 \le t \le T$ where $\dim V = d$ and $r_0 = h$.
Then, we shall represent (6.1) as follows

(6.6)   $r_t = \phi_t + \sum_{i=1}^{p} Y^i_t \varphi_i + \sum_{j=p+1}^{d} Y^j_t \varphi_j; \quad 0 \le t \le T,$

where $Y$ is a truly $d$-dimensional semimartingale satisfying $\mathcal{W}(Y) = \operatorname{span}\{Y^1,\dots,Y^p\}$, $\mathcal{D}(Y) =
\operatorname{span}\{Y^{p+1},\dots,Y^d\}$ and $V = \mathcal{Q} \oplus \mathcal{N}$, where $\mathcal{Q} = \operatorname{span}\{\varphi_1,\dots,\varphi_p\}$ and $\mathcal{N} = \operatorname{span}\{\varphi_{p+1},\dots,\varphi_d\}$.

Proof. By assumption, there exists a truly $d$-dimensional semimartingale $Z = (Z^1,\dots,Z^d)$ satisfying
Assumption 2.2 and a basis $w = \{w_1,\dots,w_d\}$ for $V$ such that

$r_t = \phi_t + \sum_{i=1}^{d} Z^i_t w_i; \quad 0 \le t \le T.$

From (6.5), we have $\mathcal{Q} \subset V$ a.s so that we shall consider the random operator $Q_T$ restricted to
$V$ as follows $Q_T : \Omega \times V \to V$. Moreover, from (6.5) we readily see that the random matrix of the
linear operator $Q_T$ is given by $\{[Z^i, Z^j]_T;\ 1 \le i,j \le d\}$ for any pair $(Z, w)$ of latent semimartingale
representation $Z$ and a basis $w$ for $V$. By Lemma 2.1, we have $\dim \mathcal{Q} = \dim \overline{\mathcal{M}}(Z)$ a.s. Let
$Y = \{Y^1,\dots,Y^d\}$ be a truly $d$-dimensional semimartingale such that $\{Y^1,\dots,Y^p\}$ is a basis for
$\mathcal{W}(Z)$ and $\{Y^{p+1},\dots,Y^d\}$ is a basis for $\mathcal{D}(Z)$ where $p = \dim \mathcal{Q}$. Then $\operatorname{span}\{Y^1,\dots,Y^d\} = \mathcal{M}(Z)$
and $Y$ satisfies Assumptions 2.1 and 2.2. Let $I : \mathcal{M}(Z) \to \mathcal{M}(Z)$ be the linear isomorphism given by
the change of basis from $Z$ to $Y$. If $[I] = \{a_{ij};\ 1 \le i,j \le d\}$ is the matrix of $I$, then we shall write

(6.7)   $r_t = \phi_t + \sum_{i=1}^{d} Y^i_t \varphi_i; \quad 0 \le t \le T,$

where $\varphi_j := \sum_{i=1}^{d} ([I]^{-1})_{ij} w_i$; $1 \le j \le d$. By writing $Q_T$ in terms of the basis $\{\varphi_i\}_{i=1}^{d}$ and using (6.7),
we clearly see that $\mathcal{Q} = \operatorname{span}\{\varphi_1,\dots,\varphi_p\}$. By taking $\mathcal{N} = \operatorname{span}\{\varphi_{p+1},\dots,\varphi_d\}$, we then conclude
(6.6). □

The main message of Lemma 6.1 is the following. When the stochastic PDE is projected onto $\mathcal{Q}$
($\mathcal{N}$), the associated latent factors are non-null quadratic variation (bounded variation)
semimartingales. We remark that the form of the FDRs (6.6) has already been derived in Björk and
Landén [14] and Filipović and Teichmann [26] in the context of HJM models. Lemma 6.1 provides an
explicit splitting for $V$ by separating the loading factors which generate $\mathcal{Q}$ from its complementary
subspace $\mathcal{N}$, attached to their associated spaces $\mathcal{W}(Y)$ and $\mathcal{D}(Y)$, respectively.

Summing up the above results, we arrive at the following identification result. 

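Proposition 6.1. Let $r$ be the stochastic PDE (6.1) satisfying Assumptions (A1-A2). For a given
$h\in\text{dom}(A)$, let $\mathcal{V}_t = \phi_t + V$; $0\le t\le T$ be the minimal foliation generated by some $V$ such that
$r_0 = h\in\mathcal{V}_0$. Let

\[ r_t = \phi_t + \sum_{i=1}^{d} Z^i_t\eta_i; \quad 0\le t\le T, \]

be a factor semimartingale representation, where $V = \text{span}\{\eta_1,\ldots,\eta_d\}$ and $Z$ satisfies Assumptions
2.1 and 2.2. Let $A\in\mathbb{M}_{d\times d}$ be a nonsingular random matrix. Let $\varphi(x) = (A^{-1})^\top\eta(x)$; $x\ge 0$
and $Y_t = AZ_t$; $0\le t\le T$. Let $\mathcal{L}$ be the random matrix whose rows are given by
$\mathcal{L}_i = v_i$; $1\le i\le d$, where $\{v_1,\ldots,v_d\}$ is an orthonormal eigenvector set of $[Y]_T$ associated to the
ordered eigenvalues $\theta_1\ge\theta_2\ge\ldots\ge\theta_d$ a.s. Then

\[ \mathcal{Q} = \text{span}\{(\mathcal{L}\varphi)_1,\ldots,(\mathcal{L}\varphi)_p\}\ \text{a.s}, \quad \mathcal{N} = \text{span}\{(\mathcal{L}\varphi)_{p+1},\ldots,(\mathcal{L}\varphi)_d\}\ \text{a.s}, \tag{6.8} \]

and

\[ \mathcal{W}(Y) = \text{span}\{(\mathcal{L}Y)^1,\ldots,(\mathcal{L}Y)^p\}, \quad \mathcal{D}(Y) = \text{span}\{(\mathcal{L}Y)^{p+1},\ldots,(\mathcal{L}Y)^d\}. \tag{6.9} \]

Proof. This is a straightforward consequence of Proposition 5.1, Lemma 6.1 and the identity

\[ \langle Z_t,\eta(x)\rangle_{\mathbb{R}^d} = \langle AZ_t,(A^{-1})^\top\eta(x)\rangle_{\mathbb{R}^d} = \langle\mathcal{L}AZ_t,\mathcal{L}(A^{-1})^\top\eta(x)\rangle_{\mathbb{R}^d}, \]

$0\le t\le T$, $x\ge 0$, due to the orthogonality of the random matrix $\mathcal{L}$. □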
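
The identity used in the proof can be checked numerically. The following is a minimal sketch, assuming a fixed nonsingular matrix `A` and an orthogonal matrix `L` built from the eigenvectors of an arbitrary symmetric matrix standing in for $[Y]_T$; all numerical values and variable names are hypothetical and only illustrate the algebra.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

Z_t = rng.normal(size=d)       # hypothetical value of the factor vector Z_t
eta_x = rng.normal(size=d)     # hypothetical loading values eta(x) at a fixed x

# A fixed nonsingular matrix (its determinant equals 7).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
A_inv_T = np.linalg.inv(A).T

# Orthogonal matrix whose rows are eigenvectors of a symmetric matrix
# (a stand-in for the quadratic variation matrix of Y = A Z).
M = rng.normal(size=(d, d))
_, eigvec = np.linalg.eigh(A @ (M @ M.T) @ A.T)
L = eigvec.T

lhs = Z_t @ eta_x
mid = (A @ Z_t) @ (A_inv_T @ eta_x)
rhs = (L @ A @ Z_t) @ (L @ A_inv_T @ eta_x)
print(lhs, mid, rhs)   # the three inner products coincide up to rounding
```

The first equality only uses $A^{-1}A = I$, and the second one only uses $\mathcal{L}^\top\mathcal{L} = I$, so the represented curve is unchanged when factors and loadings are rotated together.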

6.2. Preliminaries on Factor models. The goal of this section is to describe an estimation methodology
for the pair $(\mathcal{Q},\mathcal{N})$ which generates invariant foliations for stochastic PDEs of the form (6.1).
The methodology is inspired by the so-called Factor Analysis developed in the Econometrics
literature (see e.g. [47], [31]), but with some fundamental differences: (a) Unlike the classical
discrete Factor Analysis, we work with an underlying continuous-time process sampled
in high frequency at discrete points in time and space. (b) The spaces $(\mathcal{Q},\mathcal{N})$ cannot be identified
by applying standard techniques from Factor Analysis, due to the rather distinct behavior of
quadratic variation and covariance matrices in the high-frequency setup. (c) More importantly, the
factor analysis introduced here allows us to reduce and rank the underlying semimartingale factors in
terms of quadratic variation rather than covariance, including bounded variation components.

Throughout this section, Assumptions (A1-A2) are in force. We also assume the underlying
state-space $E$ is the Sobolev space of absolutely continuous functions $f : [a,b]\to\mathbb{R}$ such that

\[ \|f\|_E^2 := |f(a)|^2 + \int_a^b |f'(x)|^2 p(dx) < \infty, \]

where $p$ is absolutely continuous w.r.t. Lebesgue measure (see e.g. [23]) and we write $\langle\cdot,\cdot\rangle_E$ to denote
the associated inner product. For simplicity of exposition, we work with the closed subspace of E 
formed by functions /(a) = 0 and we set ^ = 1. With a slight abuse of notation we denote it by E. 

We are going to fix the minimal invariant foliation $\mathcal{V}_t = \phi_t + V$ generated by a $d$-dimensional
subspace $V$ equipped with a basis $\{\lambda_1,\ldots,\lambda_d\}$ and a truly $d$-dimensional semimartingale $(Z^1,\ldots,Z^d)$
satisfying Assumption 2.2 such that

\[ r_t = \phi_t + \sum_{j=1}^{d} Z^j_t\lambda_j. \tag{6.10} \]

In this section, we work in a high-frequency setup as follows. To shorten notation, the points of
the partitions in time $(t^n_i)_{i=1}^n$ and space $(x^N_j)_{j=1}^N$ will be denoted by $t_i = t^n_i$ and $x_j = x^N_j$, respectively,
and we set $p(n) := \sup_{1\le i\le n-1}|t_{i+1}-t_i|$ and $\delta(N) := \sup_{1\le j\le N-1}|x_{j+1}-x_j|$. We assume that the
sampling points in time and space are equally spaced. To be precise, we are dealing with a sequence
of refining partitions and we always assume that $p(n)\to 0$ and $\delta(N)\to 0$ as $n, N\to\infty$.

We assume that the observations are generated by a space-time process 

\[ X_t(x) := r_t(x) + \varepsilon_t(x); \quad 0\le t\le T,\ x\in[a,b], \tag{6.11} \]

where $\varepsilon$ represents a space-time error component satisfying some regularity conditions. In this section,
we assume that one is able to sample the curves $x\mapsto X_t(x)$ in high frequency in time. For instance,
term-structure objects like interpolated forward rate curves are examples of this type of data. See
e.g. [35] and other references therein.

In particular, under Assumptions (A1-A2), the $(n\times N)$-matrix $X_{t_i}(x_j)$ of observations admits an
affine noisy representation

\[ X_{t_i}(x_j) = \phi_{t_i}(x_j) + \sum_{k=1}^{d} Z^k_{t_i}\lambda_k(x_j) + \varepsilon_{t_i}(x_j) \tag{6.12} \]

for $i = 1,\ldots,n$ and $j = 1,\ldots,N$. Throughout this section, we assume that $\phi$ is known by the observer
and, with a slight abuse of notation, we write $X$ for the difference $X - \phi$. In matrix representation, we
shall write

\[ X = Z\Lambda^\top + \varepsilon, \qquad X_i = \Lambda Z_i + \varepsilon_i; \quad 1\le i\le n, \]

where $\Lambda := \{\lambda_j(x_i); 1\le i\le N, 1\le j\le d\}$, $X := \{X_{t_i}(x_j); 1\le i\le n, 1\le j\le N\}$, $Z := \{Z^j_{t_i}; 1\le i\le
n, 1\le j\le d\}$ and $\varepsilon := \{\varepsilon_{t_i}(x_j); 1\le i\le n, 1\le j\le N\}$.

6.3. Estimating the underlying dimension. The first step is to estimate the underlying
dimension of the finite-dimensional realization. This is an almost straightforward application of
Bai and Ng [5]. Indeed, we are interested in solving the following optimization problem (for large
$n, N$)

\[ \min_{\Lambda^k,\,Y(k)}\ p(n)\delta(N)\sum_{i=1}^{n}\sum_{j=1}^{N}\Big(X_{t_i}(x_j) - \sum_{r=1}^{k}\lambda^r(x_j)Y_{t_i}(r)\Big)^2 \]

where the minimum is taken over the set of real matrices with columns

\[ \Lambda^k = (\Lambda^1,\ldots,\Lambda^k)\in\mathbb{M}_{N\times k}; \quad Y(k) = (Y(1),\ldots,Y(k))\in\mathbb{M}_{n\times k}, \]

subject to either $\delta(N)(\Lambda^k)^\top\Lambda^k = I_k$ or $p(n)Y^\top(k)Y(k) = I_k$ (identity matrix in $\mathbb{M}_{k\times k}$). Here $\Lambda^i :=
(\lambda^i(x_1),\ldots,\lambda^i(x_N))^\top$ and $Y(i) := (Y_{t_1}(i),\ldots,Y_{t_n}(i))^\top$ for $1\le i\le k$. The index $k$ encodes the
allowance of $k$ factors in the estimation procedure.

Remark 6.5. In order to avoid curse of dimensionality issues, we assume $k < \min\{n, N\}$ and
$n, N\to\infty$ jointly.

The factor estimator is defined as follows. Let $\hat Y(k)\in\mathbb{M}_{n\times k}$ be the random matrix
$\hat Y(k) := (\hat Y_{t_i,j}(k); 1\le j\le k, 1\le i\le n)$ whose $j$-th column
is an eigenvector associated to the $j$-th largest eigenvalue of $XX^\top\in\mathbb{M}_{n\times n}$, subject to $p(n)\hat Y^\top(k)\hat Y(k) =
I_k$. The loading factor estimator is given by $\hat\Lambda^k := p(n)X^\top\hat Y(k)$.
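
The estimator just defined can be sketched numerically as follows. This is a minimal illustration, assuming an equally spaced time grid with mesh `mesh_t` (playing the role of $p(n)$) and a synthetic noise-free data matrix; the function name `pca_factor_estimator` and the synthetic factors and loadings are hypothetical and not taken from the paper.

```python
import numpy as np

def pca_factor_estimator(X, k, mesh_t):
    """Return (Y_hat, Lam_hat) for an (n, N) data matrix X.

    Y_hat has shape (n, k), its columns are eigenvectors of X X^T for the
    k largest eigenvalues, rescaled so that mesh_t * Y_hat.T @ Y_hat = I_k.
    Lam_hat = mesh_t * X.T @ Y_hat has shape (N, k).
    """
    eigval, eigvec = np.linalg.eigh(X @ X.T)   # eigenvalues in ascending order
    top = np.argsort(eigval)[::-1][:k]         # indices of the k largest ones
    Y_hat = eigvec[:, top] / np.sqrt(mesh_t)   # enforce the normalization
    Lam_hat = mesh_t * X.T @ Y_hat
    return Y_hat, Lam_hat

# Synthetic example with two factors and no noise (hypothetical data).
rng = np.random.default_rng(1)
n, N, k = 200, 30, 2
mesh_t = 1.0 / n
t = np.linspace(0.0, 1.0, n)
x = np.linspace(0.0, 1.0, N)
Z = np.column_stack([np.cumsum(rng.normal(scale=np.sqrt(mesh_t), size=n)),
                     np.sin(2.0 * np.pi * t)])
Lam = np.column_stack([np.sin(np.pi * x), x])
X = Z @ Lam.T

Y_hat, Lam_hat = pca_factor_estimator(X, k, mesh_t)
print(np.round(mesh_t * Y_hat.T @ Y_hat, 6))   # identity matrix of size k
```

In the noise-free case the product `Y_hat @ Lam_hat.T` reproduces `X`, although `Y_hat` recovers the factors only up to an invertible linear transformation, which is why the rotation matrix appears in the asymptotic analysis.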

In the sequel, we denote

\[ V(k,\hat Y(k)) := \min_{\Lambda^k}\ p(n)\delta(N)\sum_{i=1}^{n}\sum_{j=1}^{N}\Big(X_{t_i}(x_j) - \sum_{r=1}^{k}\lambda^r(x_j)\hat Y_{t_i,r}(k)\Big)^2. \]

The estimation procedure for the underlying dimension of $V$ is due to Bai and Ng [8]. They propose
a class of information criteria of the form

\[ PC(k) := V(k,\hat Y(k)) + k\,q(n,N) \tag{6.13} \]

for suitable penalty functions $q(n,N)$. One can show that the estimation of $\dim V$ can still be carried out
on the basis of the ideas contained in [5], even in the high-frequency setup, as long as the following
assumptions hold true. They are inspired by Bai and Ng [5] and Bai [9], but in
the context of a continuous-time setup sampled at discrete times. For the sake of completeness, we
list them here. In the sequel, is the space of g-integrable continuous Brownian semimartingales. 
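
The selection rule based on (6.13) can be sketched as follows. This is only an illustration under stated assumptions: the value $V(k,\hat Y(k))$ is computed from the singular values of the data matrix (the best rank-$k$ least-squares fit), and the penalty $q = h\log(1/h)$ with $h$ the sum of the two meshes is a hypothetical choice made here because it tends to zero while $q/h$ diverges; it is not the penalty used by the authors. The function name and the synthetic data are likewise hypothetical.

```python
import numpy as np

def estimate_dimension(X, mesh_t, mesh_x, kmax, penalty):
    """Minimize PC(k) = V(k) + k * penalty over k = 1, ..., kmax.

    V(k) is the scaled residual sum of squares of the best rank-k fit,
    i.e. mesh_t * mesh_x times the sum of the squared singular values
    of X beyond the k largest ones.
    """
    s = np.linalg.svd(X, compute_uv=False)      # singular values, descending
    energy = mesh_t * mesh_x * s**2
    ks = np.arange(1, kmax + 1)
    V = np.array([energy[k:].sum() for k in ks])
    PC = V + ks * penalty
    return int(ks[np.argmin(PC)]), PC

# Synthetic data with two factors plus small noise (hypothetical example).
rng = np.random.default_rng(2)
n, N = 200, 50
mesh_t, mesh_x = 1.0 / n, 1.0 / N
t = np.arange(1, n + 1) * mesh_t
x = np.arange(1, N + 1) * mesh_x
Z = np.column_stack([3.0 * np.sqrt(2.0) * np.sin(2.0 * np.pi * t),
                     2.0 * np.sqrt(2.0) * np.cos(2.0 * np.pi * t)])
Lam = np.column_stack([np.sqrt(2.0) * np.sin(np.pi * x),
                       np.sqrt(2.0) * np.sin(2.0 * np.pi * x)])
X = Z @ Lam.T + 0.05 * rng.normal(size=(n, N))

h = mesh_t + mesh_x
penalty = h * np.log(1.0 / h)                   # assumed penalty, see lead-in
d_hat, PC = estimate_dimension(X, mesh_t, mesh_x, kmax=6, penalty=penalty)
print(d_hat)
```

The smooth factors above are chosen only to make the example deterministic up to the noise term; with semimartingale factors the same routine applies unchanged.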


(Dl) Z^ G for each j = 1,... ,d and 
p{n)Y,ZuZZ ^ :=

in probability as n —>■ cx) and is a d x d positive definite matrix a.s 
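(D2) $\sup_{j\ge 1}\|\lambda(x_j)\|_{\mathbb{R}^d} < \infty$ and

\[ \Big\|\delta(N)\sum_{j=1}^{N}\lambda^\top(x_j)\lambda(x_j) - \int_a^b\lambda^\top(x)\lambda(x)dx\Big\|_{(2)}\to 0 \]

as $\delta(N)\to 0$. Moreover, $\Sigma_\Lambda := \int_a^b\lambda^\top(x)\lambda(x)dx$ is a $d\times d$ positive definite matrix.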

(D3) The error process e satisfies assumptions: 

• Eetiixj) = 0, Esupjj \£ti{xj)Z‘ < oo 

• If 7Ar(tj,tj) := E{eti,etZv.f^5{N) then sup^ 7Ar(ti, <i) < oo and the sum p{n) l7iv(ti,tj)| 

is bounded in n, N. 

• sup^>i(5(iV)X;^m=iSupj |Eet,(xr„)£t,(a:^)| < oo. 

• sup„_^>i(5(iV)p(n)X;"j=iE^m|Eeti(a:f)£7(a:m)| < oo. 

• E 5^/‘^{N)J2f^i[£uixe)etZxe) -Eet,{xe)et^{xi)] . 

• The error e and the factors Z are mutually independent. 


(D4) 


N 


sup snp E 

n,N ts 


Vpin)S{N) i [£u{xj)£tAxj) - E[eu{xj)et,{xj)] 

i=l j=l 


R'i 


< oo 


supE 

n,N 


n N 

Vpin)S{N) EE Zti ^{xj )£ti {xj ) 

i=i j=i 


2 

< OO 

( 2 ) 


Remark 6.6. The assumption Rank $\Sigma_Z = d$ a.s. is not strong. Indeed, since $\Sigma_Z$ is a Gramian matrix,
if $Z$ does not satisfy (D1) then we can reduce the effective dimension without losing information.
More importantly, we stress that (D1) implies that the factors satisfy Assumption 2.1, but it does not imply
that $[Z]_T$ has full rank a.s. The fact that $\Sigma_\Lambda$ is a positive definite matrix is equivalent to the fact that
$\{\lambda_1,\ldots,\lambda_d\}$ is linearly independent on the state space $E$ equipped with the $L^2([a,b];\mathbb{R})$-inner product.
In contrast to the usual factor analysis (see e.g [iin];, we stress that Ez is random. 


In this case, under some mild growth condition on $q(n,N)$,

\[ \hat d := \arg\min_{1\le k\le kmax} PC(k) \tag{6.14} \]

will be a consistent estimator for $\dim V$, where $kmax$ is an arbitrary integer such that $d < kmax$.
The proof of this statement is inspired by the arguments given by Bai and Ng [8] and Bai [9].
On the one hand, in contrast to [8] and [9], our asymptotic matrix $\Sigma_Z$ is random and the sampling is
in high frequency. On the other hand, Assumption (D1) allows us to prove similar results without
significant extra effort. For the sake of completeness, we give the details here. In the sequel, we denote
$C_{nN} := \min\{\delta(N)^{-1/2}, p(n)^{-1/2}\}$ and

®Since we work with the subspace of functions f G E such that f(a) = 0, then {•, ■)d ,2 is indeed an inner product 
over E. 
(N(ti,ti) := Zj.A^ete6{N)-, rjNite^ti) := Zj^A^et,6{N)

for 1 < i, £ < n. 

Lemma 6.2. If Assumptions (D1-D2-D3-D4) hold, then 

(a) pin) YTt=i\{d)lN{ti,U) = 

(b) pin) YTt=iYuid)ONiti,ti) = Orij^j^yTTT^) 

(c) pin) T,i=iYttid)fNiti,ti) = Op(5(fV)i/2) 

(d) pin)J2^^-^^Yt,id)pNiti,U) = Qp( ^(jv)-D2c^^ )- 

Proof. Let $L_{n,N}$ be the diagonal matrix of the eigenvalues of $p(n)\delta(N)XX^\top$ arranged in decreasing
order. From (D1-D2-D3-D4), one can easily check that $\|p(n)\delta(N)XX^\top\|_{(2)} = O_p(1)$ and hence
$\|L_{n,N}\|_{(2)} = O_p(1)$. In this case, the same argument given in the proof of Lemma A1 in [5] allows us
to state that

\[ C^2_{nN}\Big(p(n)\sum_{i=1}^{n}\|\hat Y_{t_i}(d) - H_d^\top Z_{t_i}\|^2_{\mathbb{R}^d}\Big) = O_p(1) \tag{6.15} \]

where $H_d := \delta(N)\Lambda^\top\Lambda Z^\top\hat Y(d)p(n)L^{-1}_{n,N}\in\mathbb{M}_{d\times d}$. Assumptions (D1-D2-D3-D4) together with
(6.15) allow us to repeat the same argument given in the proof of Lemma A2 in [5] to conclude that
the statement holds true. We omit the details. □

The next result was stated by Bai and Ng [5] in Lemma A3 (in the context of a discrete-time
model and deterministic $\Sigma_Z$) without a complete proof. For the sake of completeness, we give the details
here in our context.

Lemma 6.3. Let $L_{n,N}$ be the diagonal matrix of the eigenvalues of $p(n)\delta(N)XX^\top$ arranged in
decreasing order. If Assumptions (D1-D2-D3-D4) hold then

\[ L_{n,N}\xrightarrow{p} C := \text{diag}(c_1,\ldots,c_d) \]

as $n, N\to\infty$, where $(c_1,\ldots,c_d)$ are the eigenvalues (in decreasing order) of $\Sigma_\Lambda\Sigma_Z$.

Proof. We follow closely the idea contained in the proof of Proposition 1 in [9] . By the very definition, 
pin)6iN)XX^Yid) = Yid)Ln,N a.s and hence 

(<5(iV) A^ a) (^pin)5iN)XXJ^ Yid) = ((5(iV) AA^) (p(n)Z^y (d)) Ln,N 

From the identity X = ZA^ -|- f, we actually have 

1/2 

((5(iV)A^A) (p(n)Z^Z) ((5(iV)A^A) (Z^F(d)p(n)) -f SnN 

(6.16) 
where 

Sn.iv := i5iN)A^ AY^^[pin)ilAz)pin)5iN)A^ E^Yid) + pin)5iN)Z^ SAZ^Yid)pin) 

(6.17) 


(5iN)AA^ 


1/2 


ipin)Z^Yid))Lr,,N 
+ p{n)5{N)Z^ ££^Y{d)p{n)] = op ( 1 )

1/2 

due to Lemma 15^ Let Un^N '■= {5{N)K^ (p(n)Z^Z) and 

1^2 

:= (^(iV)AA^) {p{n)l7Y{d)). 

We shall write (16.161) as follows 

\Un,N Y = Eji j^Lji j^ 

where E^ is the pseudoinverse of En^N- Then each column of En,N is an eigenvector of Nn^N + 
Sn,NEn,NE ^Since En^NE^= Op(l) then (16.1711 and Assumptions (Dl, D2) yield 

\\Un,N + Sn,NE„,NElj^ - z^]!\2) A 0 

as n,N —>• 00 . By the continuity of the eigenvalues, we do have ||L„,Ar — C||( 2 ) A 0 as n, A^ —>■ oo. 
1/2 1/2 

Since YzY^ and Y\Yz have the same random eigenvalues, we conclude the proof. □ 

Lemma 6.4. Assume that hypotheses (D1-D2-D3-D4) hold and the eigenvalues ofYxYz G M^xd 
are distinct almost surely. Then, for every j = 1,... ,d, there exists a random vector (Gy,..., Gdj) 
such that 


( AA ■ • ■ ’ XI A/ ) A (Gy ,...,Gdj) 

\ t=i 1=1 ) 

as n,N —>■ oo. Moreover, the matrix G := (Gy)i<ij<d is invertible a.s and it is given by G = 
Ci/ 2 (j)Ty;^ 1/2 eigenvector matrix related to C subject to = Id a.s. 

Proof. By using Lemma 6.3, the proof is identical to Proposition 1 in Bai [9], even in the case when
$\Sigma_Z$ is random. We refer the reader to the discussion on page 162 in [9]. □

We are now able to present the following result. 

Lemma 6.5. Let us assume that assumptions (D1, D2, D3, D4) hold and let $\hat d = \arg\min_{1\le k\le kmax}PC(k)$.
Assume the eigenvalues of $\Sigma_\Lambda\Sigma_Z\in\mathbb{M}_{d\times d}$ are distinct almost surely. Then, $\lim_{n,N\to\infty}\mathbb{P}[\hat d = d] = 1$ if
(i) $q(N,n)\to 0$ and (ii) $C^2_{nN}q(N,n)\to\infty$ as $n, N\to\infty$, where $C_{nN} = \min\{\delta(N)^{-1/2}, p(n)^{-1/2}\}$.

Proof. The same arguments given in the proof of Theorem 1 in Bai and Ng [5] apply in our context.

In particular, Lemmas 2, 3 and 4 in Bai and Ng [8] can be similarly proved in our context
by using Assumptions (D1, D2, D3, D4) and the fact that $\Sigma_\Lambda\Sigma_Z$ has distinct eigenvalues a.s.
Moreover, the fact that $\Sigma_Z$ is not deterministic is not essential for the validity of the analogous results
of Lemmas 2, 3 and 4 given by [5] in our context, as long as rank $\Sigma_Z = d$ a.s. (Assumption D1). In
particular, for fc < d let us define := {k)Zp{n)A^A6{N) G M^xd- In our context. Lemma 3 in 
[5] can be written as follows: There exists > 0 a.s such that 

liminf V(k,ZJk) - V{d,Z) = tr{Rk.Yx) =: Tk 

n,N—^c>o 

in probability, where Rk := Yz — EzIHIfe(lHl/EzIHIfc)“^lHl/Ez and Hfe := lim„_Ar->.oo Tfc exists due to 
Lemma 16.41 By construction rank Hlfc = k < d a.s. Assumptions (D1-D2) yield tr{Rk.Yx) > 0 a.s. 

By writing 


PG(fc) - PG(d) = V{k, Y{k)) - V(d, Y{d)) - (d - k)q{n, N) 

and splitting 

Vik, Y{k)) - Vid, F(d)) = [Vik, Y{k)) - V{k, ZJfe)] + [V{k, ZJk) - V{d, ZJd)] 
+ [V{d,ZJd)-V{d,Y{d))],

we shall use the same argument in the proof of Th 1 in [5] to conclude that 

lim F{PC{k) < PC{d)} = 0, 

n,N—>-oo 

for each k < d. If kmax > k > d, then similar to Lemma 4 in 0 , Assumptions (D1,D2, D3, D4) 
and the fact that the eigenvalues of G M^xd are distinct almost surely yield 

VikSm -Vid,Yid))) = OAC-^)- 

The rest of the proof is identical to the proof of Th 1 in 0, so we omit the details. □ 

6.4. Main Results. Let us now present the main results of this section. The following list of
assumptions will also be in force throughout this section.

(Q1) The eigenvalues of $\Sigma_\Lambda\Sigma_Z\in\mathbb{M}_{d\times d}$ are distinct almost surely.

(Q2) We assume 


pin) 

l<^<s<n 

is bounded in probability for every k € {1,... ,d}. 

(Q3) 


pin) ^2 Puy^i^tf:^ ^:i)«-nS{N) A-0 

l<£<s<n 

as n, N ^ oo for each k,r,j € 

(Q4) There exists a sequence of positive numbers $\{\gamma(n); n\ge 1\}$ decaying to zero such that

n 

E^||Ae,J|2,<5(iV) = 0(7(n)). 

(Q5) supq<« 7 ^ Iktill; is bounded in probability and for each i G { 1 ,... ,d}, 

pin) \ylyl\\\^tA\E\\stJ\E ^ 0 

l<i<s<fi 

as 71 —>• 00 . 


Remark 6.7. Assumption (Q1) is essential to our estimation procedure because it yields an
asymptotic $Y\in\mathcal{X}^d$ and a random basis for $V$ which will allow us to construct a consistent pair of estimators
$(\hat{\mathcal{Q}},\hat{\mathcal{N}})$ for the splitting $V = \mathcal{Q}\oplus\mathcal{N}$ of the invariant manifold $V$. The technical conditions (Q2, Q3,
Q5) are not strong since they impose a very mild growth condition on the eigenvectors of $X^\top X$.
Assumption (Q4) is quite natural for error structures arising in space-time data generated by stochastic
PDEs. For example, regarding the consistency problem of the HJM model (see Section 4.2),
assumption (Q4) means that the initial fitting method used to interpolate the points which generate $X$ cannot
introduce an extrinsic volatility for the market. In other words, (Q4) rules out pure martingale error
structures.
The starting point for the estimation of $(\mathcal{Q},\mathcal{N})$ is to take advantage of the identities (6.8) and
(6.9) based on a quadratic variation matrix $[Y]_T$ constructed from an asymptotic $Y\in\mathcal{X}^d$ satisfying
Assumptions 2.1 and 2.2. We define such a process as follows: Let $Z$ be a factor representation of (6.1)
satisfying Assumption (A2) and (D1-D2-D3-D4-Q1), and let $G$ be the associated matrix defined
in Lemma 6.4. Since $G$ is nonsingular a.s. and $\Sigma_\Lambda$ is positive definite, the random matrix
$A = C^{-1}G\Sigma_\Lambda = (A_{ij})\in\mathbb{M}_{d\times d}$ given by

\[ A_{ij} = \sum_{k=1}^{d} c_i^{-1}G_{ik}\int_a^b\lambda_k(x)\lambda_j(x)dx; \quad 1\le i,j\le d, \tag{6.18} \]

is non-singular a.s. Then we shall apply Remark 6.3 to state that $AZ\in\mathcal{X}^d$ is a truly $d$-dimensional
process and it is a factor measurable process realizing (6.4) for the basis (loading factors) $(A^{-1})^\top\lambda$.
From Remark 6.4, $AZ$ satisfies Assumptions 2.1 and 2.2.

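In the sequel, for a given factor representation $Z$ of (6.1) satisfying Assumption (A2) and the
assumptions in Lemma 6.4, we set $Y = AZ$. Let $[\hat Y]_T := (\hat m_{\ell k})_{1\le\ell,k\le\hat d}$ and $[Y]_T := (m_{\ell k})_{1\le\ell,k\le d}$ be
the matrices given, respectively, by

\[ \hat m_{\ell k} := \sum_{i=1}^{n-1}\big(\hat Y_{t_{i+1},\ell}(\hat d) - \hat Y_{t_i,\ell}(\hat d)\big)\big(\hat Y_{t_{i+1},k}(\hat d) - \hat Y_{t_i,k}(\hat d)\big), \tag{6.19} \]

for $1\le\ell,k\le\hat d$ and

\[ m_{sv} := [Y^s,Y^v]_T; \quad 1\le s,v\le d. \]

We stress that $Y\in\mathcal{X}^d$ has a quadratic variation matrix in the sense of Definition 2.1.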
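
The realized matrix in (6.19) is a sum of outer products of increments. The following minimal sketch computes it for a hypothetical two-dimensional path made of a simulated Brownian motion and the bounded variation component $t$; the function name and the simulated path are illustrative assumptions, not part of the estimation procedure itself.

```python
import numpy as np

def realized_qv_matrix(Y):
    """Realized quadratic variation matrix of a path sampled in the rows of Y."""
    dY = np.diff(Y, axis=0)    # increments between consecutive sampling times
    return dY.T @ dY           # entry (l, k) sums products of increments

rng = np.random.default_rng(3)
n, T = 10_000, 1.0
t = np.linspace(0.0, T, n + 1)
dB = rng.normal(scale=np.sqrt(T / n), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])

Y = np.column_stack([B, t])    # Brownian component and a bounded variation one
QV = realized_qv_matrix(Y)
eigval, eigvec = np.linalg.eigh(QV)
print(QV)
print(eigval)                  # one eigenvalue is close to zero
```

The diagonal entry of the bounded variation component equals $T^2/n$ and vanishes as the mesh shrinks, which is the mechanism that separates the two groups of factors through the ordered eigenvalues.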
Proposition 6.2. If Assumptions (D1, D2, D3, D4) and (Q1, Q2, Q3, Q4) hold true, then

\[ \|[\hat Y]_T - [Y]_T\|_{(2)}\xrightarrow{p} 0 \]

as $n, N\to\infty$.

Proof. At first, by taking $n, N$ large enough, assumptions (D1, D2, D3, D4, Q1) allow us to use
Lemma 6.5 and we assume that $\hat d = d$ because $\hat d$ is an integer-valued consistent estimator. In the
sequel, if $P$ is a real-valued process then we write $\Delta_{t_i}P := P_{t_{i+1}} - P_{t_i}$; $1\le i\le n-1$. By using the
definition of $\hat Y(d)$, one can actually write

Yn (d) = hJ Zt^ + L^npA) Y Ytt (d) £/N{t£,ti) + 0N{t£, ti) -\- ^N{t£,ti) + riN{t£, ti)j 

=: HjZt,+RtAn,N), 

where Hj := Lf^qY^ {d)1p(n)Id‘ k5{N) and L nN is be the diagonal matrix of the eigenvalues of 
p{n)6{N)'K'K^ arranged in decreasing order (see Lemma EH). 

To shorten notation, we set Wt^ := HjZt,^ and (pN{te,ti) := ^N{ti,ti) + 9N{te,ti) + ^N{te,ti) + 
VN(ti,ti) for 1 < < n. In the sequel, for r,£ = 1, ...,d we denote Op.^(^r,i)iin) any random variable 

which is 0{^n) in probability, C is a constant which may differ from line to line and let us denote the 
d X d-matrix given by W := {wsq) where 


n—l 

Wsq :=Y^^tds)^Wtdq) 


for 5 , g = 1..., d. We claim that 
(6.20)

^^(Ai?-(n,A)) 4 0 

and 


(6.21) 

vec (IT) A vec ([Tj-r) 


as n,-/V —>■ oo, where vec is the usual vectorization operator. Let LnN = diag ( 71 ,... , 7 n). By the 
very definition, 

d n N 

^ Afe(a;„)Aj (x^)); 1 < i,j < d. 

k=l i=\ m=l 

By Lemma 16.31 we know that Ln^N ^ diag (ci,...,Cd) as n,7V —>■ 00, where (ci,... ,Cd) are the 
eigenvalues oi Y,\Ez- Then Lemma 15^ yields 

d d n— 1 

j—1 r=l i—\ 

d d d d 

^ EEEE ^j) L^{la,b]-,R)Cq ^Gqk{\m, Ar-)L 2 ([a,b];R)Cs ^Gsm[Z\ Z^]t 

j—1 r=l k—1 m—1 

= [r^r«]T;l<s,(z<d, 

as n, N ^ 00 . This shows that (16.211) holds. By noting that 


AYt,Ad)d^Yt,,k{d) 

( 6 . 22 ) 


AlTt, (fc) AlLt. {£) + AWt, {k)ARl (n, N) 

Ai?f^ {n, N)AWt, {£) + ARl (n, N)ARI (n, A); l<k,i< d, 


we only need to check 1)6.20p in order to conclude the proof. Let Sti{n,N) := Ln^NRuin, N) G M^xi- 
From Lemma [Hl?l we know that ||L“)y||( 2 ) = Op(l), so we only need to check that 

d n— 1 2 

(6.23) EE( A^™(n,A)^ Ao asn,A—>-oo. 

m—1 i—1 


At first, for each k G {1,..., d} we shall write 


n—1 2 

^ (A5,^^(n,A)) 


n— 1 n 

^HEE 

2=1 ^=1 


n—1 

+ 2p(n) E E y^^A^ipN{ti,ti)y^^AnpN{ts,ti) 

2=1 l<.^<s<n 

(6.24) =: ri(n,A)+T2(n,A) 

where Ai(p]^(ti, U) := (pN(ti, U) — <fN(tei ii-i); l<*<n — l,l<.^<n. We divide the argument into 
two steps. 

Analysis of Ti(n, N). It is sufficient to prove that 
n—1 n


pH EE + {Ai0N{ti,u)y 

i=l (=1 

+ {A^Niti,ti))^ + {A^riN{ti,U))'^ =Op{p{n)) 

for each k € {1 ,..., d}. In fact, a simple application of Cauchy-Schwartz inequality and the fact that 
Efci = 1 yield the following estimates 

n—1 n N n —1 N 


i=l k = l 

1/2 


(6.25) p{n)'^'^\yt,f{Ai'yN{te,ti)f < p{n)E ^ sup |£t, (a;^)|^5(A^) ^ ^ | Aet, (xfe)|^<5(iV), 

i=l i = l m = l ^ i=l I 

n —1 fi / fi—l n 

i=l \ i=l £^1 

( fi — l fi \ 

E E ) 

£^1 ) 

fi — l fi 

(6.26) + p{n) EE \yu? {{AaN{ti,ti)f + {5{N)eJ^Aetif^ , 

f -1 £=1 

fi — 1 fi d fi—l 

(6.27) Pin) J2 E lyifi^^^Nite, U))^ < Cp(n) sup |ki|||^<5(A) ^ ^ |AZ[, n|A.||^^f(A) 


f = l £=1 

fi — 1 fi 


r=li=l 

d 


(6.28) p{n) Y \yti\‘^i^iVN{te,ti))^ < Cp{n) Y IA£tJlKN<i(Af) ^sup |Zr|^||Ar||RN5(A) 


i=l £=1 


r = l 


The estimates (6.25), (6.26), (6.27) and (6.28) allow us to conclude that $T_1(n,N) = O_p(p(n))$.

Analysis of T2(n, iV). The estimates for the crossing terms are more involved. Let us split T2in,N) 
according to the terms Ai7jv(i/, ii), AiO]^iti,ti), Ai^jv(i/,<i) and Aip]^iti,ti) as follows. To shorten 
notation, in the sequel we denote J(fc, n) = pin) J2i<e<s<n I- Cauchy-Schwartz inequalities and 
routine algebraic manipulations yield the following estimates 

fi—l 


phE E \yiAtlN{te,U)yt^Ai^N{ts,ti)\ < C'J(fc,n)(^sup ||et|lg;v<5(iV)^ 


1/2 


i=l l<^<s<n 


n—1 p 


1 1/2 rn-l 


(Esup |ki|||^<5(/V)) ^ ^ |AZ[J2|1A.||2^<5(A) 


t 

n—1 


^E||Ae,J|2^^(A) 


1/2 


phE E \yt^^^N{ti,ti)y’f^A,pN{ts,U)\ < CJik,n) E Op;(r,g)(l) 

l<.^<s<n r,q—l 


n—1 


^||Ae,J||^<5(A) 


n—1 


P 


1/2 


p(n) E E \yt^AieNite,ti)yt^Ai^Nits,ti)\ < CJik,n) E Op;r.(l)sup||e 

i=l l<^<s<n r—1 ^ 
X

We also shall write 

n—1 

ytA^^Nite,U)ylA,^N{.ts,U) = 

i—1 l<i<s<n 


' n—1 


n—1 




1/2 


^ OvXr,j)Wp{n) Vuvt 


r,j=l 


l<^<s<n 


X ^r)RN^{A[){et^, Xj)^NS{N) 


and 


n—1 n—1 

pwE e (j'i')ii'jl E J {}^i ^ ^ |||]^iv^(-^)? 

2—1 l<^<s<n 2—1 

n —1 / n —1 \ 

phE E \y^^AijN{te,ti)y^^Air]N{ts,ti)\ < CJ{k,n) J;||A£„||J,«(]v) , 

2 = 1 l<^<S<n \ 2 = 1 / 


n—1 p / 22—1 

P^’^^E E |Pt';A*?7w(2^,t*)yt';A,77Ar(2^,t,)| < C'J(fc,n) Y C>P;(g.r) 1 E 

2—1 l<£<s<n q,r—l \ 2—1 

The remainder terms in $T_2(n,N)$ are analogous. Summing up the above estimates, we conclude that
$T_2(n,N)\to 0$ in probability as $n, N\to\infty$. From identities (6.22), (6.23), (6.24) and (6.20), we
conclude the proof. □

The next step is the analysis of the convergence of the loading factor estimators defined as follows. 
Let 


:=p(n)F^(d)XGM,-,^ 

and 


(Pi{x) := yjp{n)Yvlu^tk{x), ^k{x) := ((^ ^)^A(a;))/c 

fc=i 

for a < a: < 6,1 < / < d, 1 < fc < d. Since A G M^xd is non-singular a.s, then ')} is 

a basis for V for almost all w G fl. More importantly, 

d 

fc=i 

where Y = AZ eX‘^. 

Proposition 6.3. If Assumptions (Dl, D2, D3, D4, Ql, Q5) hold true, then 

d 

/=i 


as n, N ^ oo. 
Proof. Let us fix $i\in\{1,\ldots,d\}$. Since $\hat d$ is an integer-valued consistent estimator for $d$, we shall
assume that $\hat d = d$. Under (D1, D2, D3, D4, Q1), $\{\xi_i; 1\le i\le d\}$ is a well-defined random basis
for $V$. By the very definition,


n d n 

k—1 m—1 k—1 

n 

+ ^ \/p(n)?/t\etJa;) =: Ri^x) +R2,i{x),x G [a,&]. 

k^l 

Let us recall that for any / G U, we can compute the Sobolev norm as follows ||/|||; = supn X^sjGB 
00 where the sup is taken over all partitions If of [a, &]. See e.g Prop 1.45 in [33] for further details. 
If n = {sj}^i is a partition of [a, d], then 


E 




As,- 

s,Gn 


Sj GH k—1 

+ 2 p(n) 

l<fc<m<n 

= : Il,i + l2,i 

Since p{n)Y^{d)Y(d) = Id a.s, then (Q5) yields 


tk I 


12 I A gtfc fa) P 
As,- 


ytkytm / . 


As,- 

s,Gn •’ 


As, 


|U,i| < 


t=i 


0<t<T 


as n —oo. Cauchy-Schwartz inequality and (Q5) yield 


|/ 2 .*| < 2p{n) Y IdldlllktJblktJU ^ 0 

l<^<s<n 

as n —7 > oo. From Lemma lOl we know that (G^)“^ = ^ so that = G. Since {Ai,..., A^} C E, 

then we obviously have ||i?i,i(-) — Ci(‘)llE 2 a 0 as n —oo. This concludes the proof. □ 

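In the sequel, $\hat p$ is any consistent estimator for $\dim\mathcal{Q}$ based on $X$. See Appendix for details.
Let $\hat{\mathcal{L}}\in\mathbb{M}_{\hat d\times\hat d}$ be the matrix whose rows are given by $\hat{\mathcal{L}}_i := \hat v_i$; $1\le i\le\hat d$, where $\{\hat v_1,\ldots,\hat v_{\hat d}\}$ is
an orthonormal eigenvector set of the matrix $[\hat Y]_T$ (see (6.19)) associated to the ordered eigenvalues
$\hat\theta_1\ge\hat\theta_2\ge\cdots\ge\hat\theta_{\hat d}$. Let us define

\[ \hat Z^j_{t_i} := j\text{-th component of } \hat{\mathcal{L}}\hat Y_{t_i}; \quad 0\le i\le n,\ 1\le j\le\hat d \tag{6.29} \]

and

\[ [\hat Z^j]_T := \sum_{i=1}^{n}\big(\hat Z^j_{t_i} - \hat Z^j_{t_{i-1}}\big)^2 \]

over a sample $0 = t_0 < t_1 < \cdots < t_n = T$. By the very definition, $[\hat Z^j]_T = \hat\theta_j$; $1\le j\le\hat d$.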

Now we are able to present the main result of this article. Before this, we need an elementary
lemma from linear algebra.

Lemma 6.6. Let $v_1,\ldots,v_d$ be a set of $d$ linearly independent vectors in a real Hilbert space $H$
with inner product $\langle\cdot,\cdot\rangle_H$ and $V = \text{span}\{v_1,\ldots,v_d\}$. Let $T : V\to V$ be an orthogonal matrix. If
$\tau_1,\ldots,\tau_d$ is the Gram-Schmidt orthonormalization of $v_1,\ldots,v_d$ and $w_1,\ldots,w_d$ is the Gram-Schmidt
orthonormalization of $Tv_1,\ldots,Tv_d$, then we have

\[ w_i = T\tau_i, \quad i = 1,\ldots,d. \]

Proof. The proof follows by just observing that for each $v\in V$, $\|Tv\|_H = \|v\|_H$, and for each $u\in V$,
we have $T(\text{Proj}_v u) = \text{Proj}_{Tv}(Tu)$, where $\text{Proj}_v u = v\langle u,v\rangle_H/\|v\|^2_H$. □
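
Lemma 6.6 can be verified numerically in $\mathbb{R}^d$ with the Euclidean inner product. The sketch below is an illustration only: the helper `gram_schmidt`, the random vectors and the random orthogonal matrix `T_orth` (obtained from a QR factorization) are hypothetical choices.

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V, in order, by Gram-Schmidt."""
    Q = np.zeros_like(V, dtype=float)
    for i in range(V.shape[1]):
        u = V[:, i].astype(float)
        for j in range(i):
            # remove the component along the j-th orthonormal vector
            u = u - (Q[:, j] @ u) * Q[:, j]
        Q[:, i] = u / np.linalg.norm(u)
    return Q

rng = np.random.default_rng(4)
d = 4
V = rng.normal(size=(d, d))                        # columns v_1, ..., v_d
T_orth, _ = np.linalg.qr(rng.normal(size=(d, d)))  # an orthogonal matrix

tau = gram_schmidt(V)              # orthonormalization of v_1, ..., v_d
w = gram_schmidt(T_orth @ V)       # orthonormalization of T v_1, ..., T v_d
print(np.max(np.abs(w - T_orth @ tau)))   # numerically zero
```

The check works because an orthogonal map preserves both norms and projections, so every step of the orthonormalization commutes with it.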

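Theorem 6.1. Let $r$ be the stochastic PDE (6.1) satisfying Assumptions (A1-A2). Assume the
existence of a factor representation satisfying Assumption (A2) and (D1, D2, D3, D4, Q1, Q2,
Q3, Q4, Q5). For a given $h\in\text{dom}(A)$, let $\mathcal{V}_t = \phi_t + V$; $0\le t\le T$ be the minimal foliation
generated by $V$ such that $r_0 = h\in\mathcal{V}_0$ and we set

\[ \hat{\mathcal{N}} := \text{span}\{(\hat{\mathcal{L}}\hat\varphi)_{\hat p+1},\ldots,(\hat{\mathcal{L}}\hat\varphi)_{\hat d}\}, \quad \hat{\mathcal{Q}} := \text{span}\{(\hat{\mathcal{L}}\hat\varphi)_1,\ldots,(\hat{\mathcal{L}}\hat\varphi)_{\hat p}\}. \]

Then, $V = \mathcal{Q}\oplus\mathcal{N}$ a.s. and

\[ \max\{d(\hat{\mathcal{N}},\mathcal{N}), d(\hat{\mathcal{Q}},\mathcal{Q})\}\xrightarrow{p} 0 \quad\text{as } n, N\to\infty. \]

Moreover,

\[ [\hat Z^1]_T\ge\cdots\ge[\hat Z^{\hat d}]_T\ \text{a.s}, \qquad [\hat Z^i]_T\xrightarrow{p} 0,\ p+1\le i\le d \quad\text{as } n, N\to\infty, \tag{6.30} \]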

and 




as n, N ^ oo. 

Proof. From Assumptions (A1-A2), we shall fix a pair {Z, A) which realizes 


d d 

nix) = (ftix) + ^ Zi Xj{x) = (j)t{x) + ^ Yf£,j{x); 0<t<T 
i=i i=i 

where V = spanjAi,..., A^} = spanj^i,..., is a continuous semimartingale satisfying As¬ 

sumption (A2) and (Dl, D2, D3, D4), (Ql, Q2, Q3, Q4, Q5). Here, we set Y = AZ and 
f{x) = (Al“^)^A(a:), where A is given by (16.41) . From Remark 16.41 Y satisfies Assumption (A2) as 
well. 

To shorten notation, we abbreviate Gram-Schmidt orthonormalization by GSO. Let 


Af = span{{Cf)p+i,..., 

and 

Q = span{(£^)i,..., (£0i?}- 

Following the same lines as in the proof of Theorem 15. II and noting that (see Remark 1 6 .4 p 
^Af) = Ker{[Y]T) and 4>(A/') = Ker([Y]T), 

we obtain 

(6.31) d{Af,Af)^0, and d{Q,Q)^0, 

as n, N ^ oo. By using the triangle inequality, we obtain 

d{M,Af) < d{M,M) + d{M,Af), 


and from equation (I6.31L it is enough to prove that d{Af,Af) 


p 


0. as n, iV —>• 00 . 
Let $\{\hat\varphi_1,\ldots,\hat\varphi_{\hat d}\}$ and $\{\xi_1,\ldots,\xi_d\}$ be as in Proposition 6.3. Let $n$ and $N$ be large enough so that
p = p and d = d. Let {ri,...,Td} and be the GSO of and 

respectively. Lemma 16.61 allows us to state that ..., Crd} is the GSO of {C^i, ■ ■ ■, C^d} and 

{£ti, .. .,CTd) is the GSO of {£^i,.. .,Cpd}- 

From the orthonormalization procedure, for each k < d, we have 

span{Cipi, Cipk} = span{CTi ,..., Crk} 

and 

span{C^i ,.. .,C^k} = span{CTi,.. .,£Tk}. 

Thus, 

J\f = span{CTi,..., CTp} and J\f = span{CTi,..., Crp}. 

Therefore, since $ is an isometry, we have 

d{MM)=D{MM) = 1- i^(((£r)„(£r),))2 a.s. 


Let us work with the quantity inside the brackets, and let us introduce some notation: Denote the 
matrix of £ by {fizj}, i.e., for any vector v S 


{Cv)i 


'D ■ 


Note that since the transformation C is orthogonal, we have 

^ ^ (^ik^jk — dij Q.S. 


Observe that from Proposition 16.dl we have that {Ti,Tj) A 6ij as n, TV —>■ oo. Since C is orthogonal 
and the set of orthogonal matrices is compact, the set {dij} is uniformly bounded in n and N, so 
that 


and 


as n, N ^ oo. 
Therefore, 


^ ^ dikdjp^Tk, Tp') 
k^p 


^ ^ ((T/e, Tfc) 1) 


0 , 


^ ^ '^p') ^ij 

= 

Q'ik^jp{Tk', Tpj ^ ^ 

^ik^jk 


k,p 


k,p k 





< 

^ ^ ((^fc? ^/c) 1) 

+ 

^ ^ ^ik^jpi'^k') '^p) 


fe/p 


as n, > oo, which thus implies that 
1 


^(((£r)i,(£r)j))2 = - ^ ^ d*fc%p(ffe, Tp) 




i,j \ k,p 
and then,

\[ d(\hat{\mathcal{N}},\mathcal{N})\xrightarrow{p} 0 \]

as $n, N\to\infty$. The proof for the statement $d(\hat{\mathcal{Q}},\mathcal{Q})\to 0$ in probability follows from the same
reasoning as in the proof of Theorem 5.1. Let us now check the ordering (6.30). Let $\hat\theta_1\ge\ldots\ge\hat\theta_{\hat d}$ a.s.
be the eigenvalues of the self-adjoint non-negative matrix $[\hat Y]_T$ arranged in decreasing order a.s. By
the very definition,

\[ \hat\theta_i = [\hat Z^i]_T; \quad 1\le i\le\hat d\ \text{a.s.} \]

Moreover, the eigenvalues are continuous functions of the entries of the matrix and $\hat p = p$ and $\hat d = d$
for $n, N$ large enough. Then,

\[ \max_{p+1\le i\le d}\hat\theta_i\xrightarrow{p} 0 \]

as $n, N\to\infty$. This shows that (6.30) holds. Lastly, by the very definition, the matrix of the random
operator Qt : V ^ V computed along the basis {^i,... ,^d} is given by 1 < < d} S 

Mdxd- Therefore, ||Qt||( 2 ) = I|[^]t||( 2 ) Proposition 16.21 yields 

d d 

II[p]tII(2) = ^ II*5t||(2) 

1=1 

a.s n, N ^ oo. This concludes the proof. □ 


7. Simulation Studies and Applications 

In this section, we present some numerical results to illustrate the methodology developed in this 
article. 

7.1. Semimartingale PCA. In this section, we illustrate the estimation of the factor spaces $(\mathcal{W},\mathcal{D})$
based on a finite-dimensional semimartingale system sampled in high frequency. In particular, the goal
is to illustrate Proposition 3.1. In the simulation below, we assume that one observes a 4-dimensional
semimartingale as follows: We consider a Markov diffusion

\[ dM_t = \mu(M_t)dt + \sigma(M_t)dB_t \]

driven by a 3-dimensional Brownian motion B = {B^, B^, B^) and the vector fields /x : —>■ and 

(T : R^ —>■ M4x3 are given by /x(a:i,..., X4) = {x2, — 2 xi + x^^x^, —Xi) and 


\[ \sigma(x_1,\ldots,x_4) = \begin{pmatrix} 1 & 0 & x_2 \\ 0 & 1 & 0 \\ 0 & 0 & x_2 \\ 0 & 0 & x_2 \end{pmatrix}. \]

One can easily check W = {B^,B^,J M^dB^} and since M is a truly 4-dimensional semimartingale, 
then A4 = W (B 1) where dim 1) = 1. The observation times are taken to be equidistant: = 

k = 0,... ,71 — 1 where the total number of observations is n = 2000. The estimated factors 
in Figure [T] are ranked in terms quadratic variation (see Theorem and we clearly observe that 
J'^ identifies a null quadratic variation factor which generates B. The quadratic variation explained 
by the principal components are given by Table 1, where rji = 4 ^^^ / , 1 < * < 4 and 9i is the f-th 

^r=l 

estimated eigenvalue related to i-th estimated principal component JL 
Table 1. Quadratic variation explained by the principal components

$\eta_1$    $\eta_2$    $\eta_3$    $\eta_4$
0.7523    0.9021    0.9996    1.0000

[Figure 1: two panels plotted against $t$, titled "Sample realization of M" and "Sample realization of the estimated factors $\hat J$".]

Figure 1. Estimation of a basis for $\mathcal{W}$ and $\mathcal{D}$

7.2. Variance versus quadratic variation. In this section, the goal is to illustrate that a naive
application of standard factor models to dimension reduction in terms of quadratic
variation fails. For this purpose, we consider two very simple space-time two-dimensional
semimartingales driven by a single Brownian motion $B$:

\[ X_t = B_t\lambda_1(x) + (\sin(15t) - B_t)\lambda_2(x), \]

\[ U_t = B_t\lambda_1(x) + (\sin(3t) - B_t)\lambda_2(x), \]

where $B$ is a one-dimensional Brownian motion, $\lambda_1(x) = x\cos(x)$ and $\lambda_2(x) = \cos(x) - x\sin(x)$; $0\le
x\le 5$, $0\le t\le 2\pi$. In the sequel, the drift components are denoted by $\Gamma^1 = \sin(15t)$, $\Gamma^2 = \sin(3t)$
and we set $H^1 = (B,\Gamma^1 - B)$ and $H^2 = (B,\Gamma^2 - B)$. Let $\mathcal{M}(H^1)$ and $\mathcal{M}(H^2)$ be the dynamic spaces
generated by $H^1$ and $H^2$, respectively. We clearly have

\[ \mathcal{M}(H^1) = \text{span}\{B\}\oplus\text{span}\{\Gamma^1\}, \quad \mathcal{M}(H^2) = \text{span}\{B\}\oplus\text{span}\{\Gamma^2\}. \]

In the time variable, the observation times $t_k$; $k = 0,\ldots,n-1$ are taken to be equidistant, where
the total number of observations is $n = 2000$. In the spatial variable, the observation points $x_k$;
$k = 0,\ldots,N-1$ are taken to be equidistant, where the total number of observations is $N = 31$.

The estimated pair of factors provided by the high-frequency factor model based on variance will be denoted by $(\widehat{Y}^1, \widehat{Y}^2)$. Here, $\widehat{Y}^1$ is the estimator of the leading factor component in terms of variance. The estimated pair of factors provided by the high-frequency factor model based on quadratic variation will be denoted by $(\widehat{Z}^1, \widehat{Z}^2)$. Here, $\widehat{Z}^1$ is the leading factor component in terms of quadratic variation (see (6.29)).

In Figure 2 we clearly see that the factor analysis based on second moments is not able to identify $(B, \Gamma^1)$. The leading estimated factor $\widehat{Y}^1$ resembles a bounded variation process with large variance, and the second estimated factor $\widehat{Y}^2$ is essentially the first one distorted by the Brownian paths, in such a way that the true pair $(B, \Gamma^1)$ is by no means identified. In strong contrast, Figure 3 clearly reports that the estimated pair $(\widehat{Z}^1, \widehat{Z}^2)$ identifies the pair $(B, \Gamma^1)$. We stress that in this two-dimensional setting, the true factors can be estimated up to multiplicative constants, so that the results presented in Figures 2 and 3 show a very consistent estimation of $\mathcal{M}(H^1)$ by using our methodology. More importantly, the correct splitting and ranking in terms of quadratic variation is fairly estimated. This numerical example illustrates that the use of factor analysis based on variance to infer volatility (quadratic variation) does not have any sound basis, even in a very simple space-time semimartingale model such as $X$ above.

Figure 4 presents the results for the model $U$. In this numerical experiment, the goal is to illustrate that null quadratic variation factors with large variance may be the leading component when standard factor models in terms of variance are used. In Figure 4 we report that $\widehat{Y}^2$ estimates well the Brownian component $B$ responsible for the quadratic variation subspace of $U$, while $\widehat{Y}^1$ estimates well the bounded variation component $\Gamma^2$ responsible for the null quadratic variation subspace of $U$. However, the correct leading component of the space-time semimartingale $U$ is the Brownian motion and not $\Gamma^2$. This simple example shows that prioritising components with large variance by using standard factor models may be completely superfluous in terms of quadratic variation. Hence dimension reduction for semimartingale systems cannot be accurately performed by using classical dimension reduction based on variance.
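The contrast between the two rankings can be explored numerically. The following is a minimal sketch and not the estimators developed in this paper: it simulates the model $U$ on a grid and compares the eigenvalues of the sample covariance of the levels (a stand-in for a variance-based factor model) with those of the realized covariation matrix of the increments (a stand-in for an analysis based on quadratic variation). The function names, the seed and the finer time grid ($n = 20000$ instead of $n = 2000$) are assumptions made for illustration.

```python
import numpy as np


def simulate_U(n=20000, n_x=31, seed=0):
    """Simulate U_t(x) = B_t*lam1(x) + (sin(3t) - B_t)*lam2(x) on a grid.

    The grid sizes and the seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n + 1)
    x = np.linspace(0.0, 5.0, n_x)
    dB = rng.standard_normal(n) * np.sqrt(np.diff(t))
    B = np.concatenate(([0.0], np.cumsum(dB)))
    lam1 = x * np.cos(x)
    lam2 = np.cos(x) - x * np.sin(x)
    U = np.outer(B, lam1) + np.outer(np.sin(3.0 * t) - B, lam2)
    return t, x, U


def second_to_first_ratios(U):
    """Return (ratio for level covariance, ratio for realized covariation).

    Each ratio is the second largest eigenvalue divided by the largest one.
    """
    cov_levels = np.cov(U, rowvar=False)      # second-moment structure
    dU = np.diff(U, axis=0)                   # increments in time
    realized = dU.T @ dU                      # realized covariation matrix
    ev_var = np.linalg.eigvalsh(cov_levels)[::-1]
    ev_qv = np.linalg.eigvalsh(realized)[::-1]
    return ev_var[1] / ev_var[0], ev_qv[1] / ev_qv[0]


if __name__ == "__main__":
    _, _, U = simulate_U()
    var_ratio, qv_ratio = second_to_first_ratios(U)
    print("level covariance, second/first eigenvalue:", var_ratio)
    print("realized covariation, second/first eigenvalue:", qv_ratio)
```

In this sketch the realized covariation matrix is numerically close to rank one, because the squared increments of $\sin(3t)$ sum to a quantity of order $1/n$, whereas the level covariance keeps two sizeable eigenvalues.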



Figure 2. Estimated factors of X. 


7.3. Estimating finite-dimensional realizations from a SPDE. Here, we illustrate our methodology with some applications to space-time semimartingale models. The first example is based on a Markov diffusion
[Figure 3 panels: sample path realizations of $\widehat{Y}$, $\widehat{Z}$ and $\Gamma$, plotted against $t$.]

Figure 3. Comparison of the estimated factors of Figure ??


Figure 4. Specification of the standard factor analysis for a null quadratic variation component as the leading semimartingale component in terms of variance


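$$dM_t = \mu(M_t)dt + \sigma(M_t)dB_t$$

driven by a 3-dimensional Brownian motion $B = (B^1, B^2, B^3)$, where the vector fields $\mu : \mathbb{R}^4 \to \mathbb{R}^4$ and $\sigma : \mathbb{R}^4 \to \mathbb{M}_{4\times 3}$ are given by

$$\mu(x_1,\dots,x_4) = \begin{pmatrix} x_2 \\ -2x_1 + x_3 \\ x_4 \\ -x_1 \end{pmatrix}, \qquad \sigma(x_1,\dots,x_4) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & x_2 & 0 \\ 0 & 0 & x_1 \\ 0 & 0 & 0 \end{pmatrix}$$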
where

(7.1) $r_t = \sum_{i=1}^{4} M^i_t \lambda_i,$

and $\lambda_1(x) = x\cos(x)$, $\lambda_2(x) = \cos(x) - x\sin(x)$, $\lambda_3(x) = -2\sin(x) - x\cos(x)$, and $\lambda_4(x) = x\sin(x) - 3\cos(x)$.
In this case, W = span V = span Q = span {Ai, A 2 , A 3 } and Af = span {A 4 }. 

Figure 5 shows the estimated factors of equation (7.1) by using the variance-based factor model $Y$ and the PCA semimartingale $Z$ developed in this paper. Clearly, the variance-based factor model is not able to identify the subspace $\mathcal{V}$ (and hence $\mathcal{N}$ as well), while the PCA semimartingale does. In addition, Table 2 presents

$$\lambda_k := \frac{\sum_{j=1}^{k} \widehat{m}_{jj}}{\sum_{j=1}^{4} \widehat{m}_{jj}},$$

where $\widehat{m}_{jj}$ is the $(j,j)$-th element of the matrix $[\widehat{Y}]_T$ (see (6.19)). Table 3 presents the variation explained by the PCA semimartingale by means of

$$\eta_i = \frac{\sum_{r=1}^{i}\widehat{\theta}_r}{\sum_{r=1}^{4}\widehat{\theta}_r}, \quad 1 \le i \le 4,$$

where $\widehat{\theta}_i$ is the $i$-th estimated eigenvalue related to the $i$-th estimated principal component. One clearly sees that the PCA semimartingale is more efficient than the variance-based factor model in identifying the quadratic variation dimension.
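Both $\lambda_k$ and $\eta_i$ are cumulative shares of a list of nonnegative numbers (diagonal entries of a quadratic variation matrix, or estimated eigenvalues). A minimal sketch of that computation follows; the function name `cumulative_share` and the input values in the usage line are hypothetical and are not taken from the experiments reported here.

```python
def cumulative_share(values):
    """Return the running sums of `values` divided by their total."""
    total = float(sum(values))
    shares, running = [], 0.0
    for v in values:
        running += v
        shares.append(running / total)
    return shares


if __name__ == "__main__":
    # Hypothetical eigenvalues, ordered from largest to smallest.
    print(cumulative_share([3.0, 1.0]))
```

By construction the last entry always equals one, which is why the final column of each table is 1.0000.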


[Figure 5 panels: sample realizations of the estimated factors $\widehat{Y}^1,\dots,\widehat{Y}^4$ and of $\widehat{Z}$.]

Figure 5. Estimated factors for the space-time semimartingale (7.1)


Dynamic distance between manifolds. Let us now investigate the robustness of our methodology in the estimation of the minimal invariant manifold, say E, for a stochastic PDE. For this purpose, we consider the following objects: Let $\widehat{\mathcal{V}} = \widehat{\mathcal{Q}} \oplus \widehat{\mathcal{N}}$ be the estimator for $\mathcal{V}$ based on the entire sample $\{(t_i, x_j); 0 \le i \le n, 0 \le j \le N\}$. Let $\widehat{\mathcal{V}}_{-k}$ be the same estimator but computed over the reduced sample $\{(t_i, x_j); i = 0,\dots,n-k, j = 0,\dots,N\}$ where $1 \le k \le K$ and $K$ is a fixed integer smaller
Table 2. Quadratic variation explained by the principal components: Variance-based factor model

$\lambda_1$    $\lambda_2$    $\lambda_3$    $\lambda_4$
0.4795    0.6912    0.9886    1.0000


Table 3. Quadratic variation explained by the principal components: PCA semimartingale

$\eta_1$    $\eta_2$    $\eta_3$    $\eta_4$
0.7805    0.9794    0.9998    1.0000

than $n$. To compute the distance $d$ between manifolds, we use the following approximation for the Sobolev inner product $\langle\cdot,\cdot\rangle_E$:


{f,9)a ■■= > -;/,ge e. 

^ Xj - Xj-l 

See e.g. Prop 1.45 in [33] for more details. Based on this approximation for $\langle\cdot,\cdot\rangle_E$, we perform the Gram-Schmidt algorithm to orthonormalize $\widehat{\mathcal{V}}$ and $\widehat{\mathcal{V}}_{-k}$. We then use (5.4) computed in terms of $\langle\cdot,\cdot\rangle_a$. We repeat the above procedure for $k = 1,\dots,K$ where $K$ is a prescribed integer smaller than $n$. The idea is to compute

(7.2) $d(\widehat{\mathcal{V}}, \widehat{\mathcal{V}}_{-k}); \quad k = 5, 10, 15, 20, \dots, 250.$
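The orthonormalization and comparison steps can be sketched as follows. This is a sketch under stated assumptions, not the procedure used for the figures: the discrete inner product in `sobolev_gram` (point value at $x_0$ plus difference quotients) is a hypothetical stand-in for $\langle\cdot,\cdot\rangle_a$, and `subspace_distance` uses the principal angles between the two spans as a stand-in for the distance defined in (5.4), which is not reproduced in this section. All function names are illustrative.

```python
import numpy as np


def sobolev_gram(x):
    """Gram matrix W such that f @ W @ g equals
    f(x_0) g(x_0) + sum_j (f_j - f_{j-1}) (g_j - g_{j-1}) / (x_j - x_{j-1}).
    This discrete inner product is an assumption made for illustration."""
    n = len(x)
    D = np.zeros((n, n))
    D[0, 0] = 1.0
    for j in range(1, n):
        h = x[j] - x[j - 1]
        D[j, j] = 1.0 / np.sqrt(h)
        D[j, j - 1] = -1.0 / np.sqrt(h)
    return D.T @ D


def gram_schmidt(V, W):
    """Modified Gram-Schmidt on the columns of V for <f, g> = f @ W @ g."""
    basis = []
    for v in V.T:
        v = v.astype(float).copy()
        for q in basis:
            v -= (q @ W @ v) * q          # remove the component along q
        basis.append(v / np.sqrt(v @ W @ v))
    return np.column_stack(basis)


def subspace_distance(V1, V2, W):
    """Frobenius norm of the sines of the principal angles between spans."""
    Q1, Q2 = gram_schmidt(V1, W), gram_schmidt(V2, W)
    cosines = np.linalg.svd(Q1.T @ W @ Q2, compute_uv=False)
    sin2 = np.clip(1.0 - cosines ** 2, 0.0, None)
    return float(np.sqrt(sin2.sum()))
```

With this choice the distance is zero exactly when the two estimated spans coincide, regardless of which basis of each span is supplied, which is the property needed to compare $\widehat{\mathcal{V}}$ with $\widehat{\mathcal{V}}_{-k}$.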

Under the existence of a finite-dimensional invariant manifold, $k \mapsto d(\widehat{\mathcal{V}}, \widehat{\mathcal{V}}_{-k})$ must vanish as $n, N \to \infty$. In order to illustrate the invariance aspect of Theorem 6.1 we consider the following stochastic PDE

(7.3) $dr_t = \big(A(r_t) + \alpha_{HJM}(r_t)\big)dt + \sum_{i=1}^{3} \lambda_i dB^i_t$

where the volatility curves are $\lambda_1(x) = x\cos(x)$, $\lambda_2(x) = \cos(x) - x\sin(x)$, $\lambda_3(x) = -2\sin(x) - x\cos(x)$ and $\lambda_4(x) = x\sin(x) - 3\cos(x)$, $r_0 = 0$, and $A = \frac{d}{dx}$ is the infinitesimal generator of the right-shift semigroup $(S_t)_{t\ge 0}$ defined by the action $S_t\varphi(x) := \varphi(t + x)$. We set $\alpha = \alpha_{HJM}$ as the classical Heath-Jarrow-Morton drift (see Heath et al [32]). One can easily check that this HJM model admits a finite-dimensional realization of the form

dZ^ = —Z^dt + dB} 

dZ^ = {-2Zl + Zl)dt + dBl 

dZl = {Zt-Zl)dt + dBl 

dZf = -Z]dt 

and a parametrization

$$\phi_t(x) = -\frac{1}{2}\big(x\sin(x) + \cos(x)\big)^2 + \frac{1}{2}\big((x+t)\sin(x+t) + \cos(x+t)\big)^2.$$
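The finite-dimensional structure can be checked by hand, because the shift generator $A = \frac{d}{dx}$ maps the span of the four curves into itself. A short verification, using only the definitions of $\lambda_1,\dots,\lambda_4$ given for (7.3):

```latex
\begin{align*}
\lambda_1'(x) &= \cos(x) - x\sin(x) = \lambda_2(x),\\
\lambda_2'(x) &= -2\sin(x) - x\cos(x) = \lambda_3(x),\\
\lambda_3'(x) &= x\sin(x) - 3\cos(x) = \lambda_4(x),\\
\lambda_4'(x) &= 4\sin(x) + x\cos(x) = -\lambda_1(x) - 2\lambda_3(x).
\end{align*}
```

Hence $\operatorname{span}\{\lambda_1,\dots,\lambda_4\}$ is invariant under $A$, even though only $\lambda_1, \lambda_2, \lambda_3$ carry Brownian noise in (7.3).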


In this case,

$$r_t = \sum_{i=1}^{4} Z^i_t \lambda_i$$

is the strong solution of (7.3). We compute (7.2) for the model (7.3), which shows that it fluctuates between zero and $2.5 \times 10^{-??}$, so we prefer not to report this numerical experiment in this section. It is more interesting to illustrate (7.2) in the presence of noise. For this purpose, we consider an observational process $X_t(x) = r_t(x) + \varepsilon_t(x)$ where $\varepsilon_t(x) = ??\,U_t\sin(\pi x)$ and $U_t$ is a standard Gaussian variable for every $t \ge 0$ such that $U_t$ is independent from $U_s$ whenever $s \ne t$. Figure 6 illustrates that the presence of noise may lead to an erroneous analysis of the existence of a finite-dimensional invariant manifold for the stochastic PDE (7.3). As the backward lag increases, the distance between manifolds increases as well, with short periods of stability.


[Figure 6 panel: distance between the manifolds.]

Figure 6. Dynamic distance (7.2): Finite-dimensional realization with noise


7.4. Application to real data sets. In this section, we illustrate the theoretical results of this article with an application to a real data set. We consider the UK nominal spot curve obtained by the Bank of England with maturities from 0.5 to 25 years (50 maturities) and daily data ranging from 27 May 2005 to 9 October 2007, a total of 601 observations. We postulate an affine structure in the data, for instance finite-dimensional realizations for an SPDE data generating process. The first task is to estimate the underlying dimension of the affine manifold. The penalty function used in (6.13) to estimate the number of factors is given on page 201 of [5]. Any of these penalty functions produces identical results for the estimation of the underlying dimension. The statistic $\widehat{d}$ (given by (6.14)) estimates seven factors for this data. In order to estimate the dimension of the quadratic variation space $\mathcal{Q}$, we make use of the Fourier-type estimator introduced by [33]. Under the assumptions of Proposition 8.1 in the Appendix, we take $\epsilon = n^{-??}$ in Corollary 8.1 in the estimation procedure. The estimation indicates $\dim \mathcal{Q} = 6$, so that $\dim \mathcal{N} = 1$. Figures 7 and 8 report the time series of the estimated factors $(\widehat{Y}, \widehat{Z})$, where $\widehat{Y}$ denotes the variance-based factor estimator and $\widehat{Z}$ is given by (6.29).

Figure 8 and the estimate $\dim \mathcal{N} = 1$ strongly indicate the presence of non-trivial drift dynamics in the data. In particular, the estimated factor with smallest variance is not able to identify the drift, while $\widehat{Z}^7$ seems to estimate a bounded variation curve subject to small errors due to observational errors or microstructure effects.
In order to compare our methodology with the standard factor model, we also perform a principal component analysis in two different versions. In Table 4, $\eta_i = \sum_{r=1}^{i}\widehat{\theta}_r / \sum_{r=1}^{7}\widehat{\theta}_r$, $1 \le i \le 7$, and $\widehat{\theta}_i$ is the $i$-th estimated eigenvalue related to the PCA semimartingale estimated principal component $\widehat{Z}^i$ given by (6.29). In Table 5, $\lambda_k := \sum_{j=1}^{k}\widehat{m}_{jj} / \sum_{j=1}^{7}\widehat{m}_{jj}$, where $\widehat{m}_{jj}$ is the $(j,j)$-th element of the matrix $[\widehat{Y}]_T$ (see (6.19)). The first PCA semimartingale component already explains 50 per cent of the total variation, while it takes the first three classical factors to approximate half of the quadratic variation contained in the data.

Figure 9 reports the dynamic distance (7.2) of the estimated manifold $\widehat{\mathcal{V}}$ over the entire period of our sample against $\widehat{\mathcal{V}}_{-k}$ where $k = 5, 10, 15, \dots, 200$. As the backward lag increases, the distance increases as well, but we observe some periods of stability over time.




Figure 7. Time series of the estimated factors 
[Figure 8 panels: time series of $\widehat{Y}^5$ and $\widehat{Z}^5$; time series of $\widehat{Y}^6$ and $\widehat{Z}^6$; time series of $\widehat{Y}^7$ and $\widehat{Z}^7$.]

Figure 8. Time series of the estimated factors

Table 4. Quadratic variation explained by the principal components: PCA semimartingale

$\eta_1$    $\eta_2$    $\eta_3$    $\eta_4$    $\eta_5$    $\eta_6$    $\eta_7$
0.5010    0.7152    0.8548    0.9471    0.9925    0.9999    1.0000


Table 5. Quadratic variation explained by the principal components: Variance-based factor model

$\lambda_1$    $\lambda_2$    $\lambda_3$    $\lambda_4$    $\lambda_5$    $\lambda_6$    $\lambda_7$
0.1161    0.4553    0.4911    0.6198    0.8927    0.9996    1.0000
[Figure 9 panel: distance between the manifolds.]

Figure 9. Dynamic distance (7.2) of the UK nominal daily spot curve


8. Appendix: Estimating dim Q 

In this section, we give a concrete alternative for estimating $p = \dim \mathcal{Q}$, which is an important component in Theorem 6.1. From (6.5), we notice that if the stochastic PDE admits a finite-dimensional realization, then the matrix of the finite-rank linear operator $Q_T$ is given by $[Z]_T$ whenever

$$r_t = \phi_t + \sum_{i=1}^{d} Z^i_t \lambda_i; \quad 0 \le t \le T,$$

for each basis $\{\lambda_1,\dots,\lambda_d\}$ of a finite-dimensional subspace which generates a finite-dimensional realization. Rigorously speaking, we cannot estimate $\dim \mathcal{Q}$ directly through high-frequency sampling of the factors because they are not observed. In this case, one has to work with a high-frequency sampling of observed curves subject to noise. This section provides a feasible estimation procedure for this.

We choose to work with the Fourier-type estimator proposed by Malliavin and Mancino [33], but we stress that other quadratic variation estimators can certainly be used as well. The strategy is to find the minimum requirements on the residual process in such a way that one can estimate the random operator $Q_T$ via an observed curve process

$$X_t(x) = r_t(x) + \varepsilon_t(x); \quad 0 \le t \le T, \ a \le x \le b.$$

If $\varepsilon$ is negligible in the quadratic variation sense, then the estimation of the kernel $Q_T(u,v)$ can be based on any reasonable non-parametric estimator of the integrated volatility. We assume that $X$ is a well-defined semimartingale random field.

In the sequel, without any loss of generality we assume that $[0,T] = [0, 2\pi]$. Let $\Pi^n = \{t^n_k; k = 0,\dots,n\}$ be the observation times and $\rho(n) = \max_{0\le k\le n-1}|t^n_{k+1} - t^n_k|$. In this section, we assume that $\rho(n) \to 0$ as $n \to \infty$, so we are able to sample the curves $x \mapsto X_t(x)$ at high frequency in time. In the sequel, we make use of the following notation

$$\varphi_n(t) := \sup\{t^n_k : t^n_k \le t\}.$$

For given positive integers $M \ge 1$ and $n \ge 1$, we define
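$$\widehat{Q}_T(u,v) := \frac{1}{2M+1}\sum_{|s|\le M}\sum_{\ell,j=0}^{n-1} e^{\mathrm{i}s(t^n_\ell - t^n_j)}\,\Delta X_{t^n_{\ell+1}}(u)\,\Delta X_{t^n_{j+1}}(v) = \sum_{\ell,j=0}^{n-1} d_M(t^n_\ell - t^n_j)\,\Delta X_{t^n_{\ell+1}}(u)\,\Delta X_{t^n_{j+1}}(v), \quad u,v \in [a,b],$$

where $\Delta X_{t^n_{\ell+1}}(u) := X_{t^n_{\ell+1}}(u) - X_{t^n_\ell}(u)$ and

$$d_M(t) := \begin{cases} 1; & t = sT,\ s \in \mathbb{Z} \\ \dfrac{1}{2M+1}\dfrac{\sin[(M+1/2)t]}{\sin(t/2)}; & t \ne sT \end{cases}$$

is the normalized Dirichlet kernel. Here $M$ encodes the Bohr convolution product and $n$ the discretization level of the Fourier transform of $\sigma_t(u,v)$. See Malliavin and Mancino [33] for more details.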
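To make the construction concrete, here is a minimal sketch of a Fourier-type estimate of the kernel on a spatial grid, written in terms of the normalized Dirichlet kernel. It is an illustration under assumptions rather than the implementation used for the results in Section 7: the function names are hypothetical, the kernel is evaluated at differences of the left endpoints of the sampling intervals, and the times are assumed to lie in $[0, 2\pi]$.

```python
import numpy as np


def dirichlet_kernel(t, M):
    """Normalized Dirichlet kernel (2M+1)^{-1} * sum_{|s|<=M} exp(i*s*t).

    Returns 1 where t is a multiple of 2*pi, and the closed form
    sin((M + 1/2) t) / ((2M + 1) sin(t / 2)) elsewhere.
    """
    t = np.asarray(t, dtype=float)
    s = np.sin(t / 2.0)
    regular = np.abs(s) > 1e-12
    safe = np.where(regular, s, 1.0)          # avoid division by zero
    value = np.sin((M + 0.5) * t) / ((2 * M + 1) * safe)
    return np.where(regular, value, 1.0)


def fourier_covariation(times, X, M):
    """Fourier-type estimate of Q_T(u, v) on the spatial grid.

    times : array of shape (n + 1,), observation times in [0, 2*pi]
    X     : array of shape (n + 1, N), one observed curve per time
    M     : number of Fourier frequencies on each side of zero
    """
    dX = np.diff(X, axis=0)                   # increments, shape (n, N)
    left = times[:-1]                         # left endpoints (assumption)
    D = dirichlet_kernel(left[:, None] - left[None, :], M)
    return dX.T @ D @ dX                      # shape (N, N)
```

Since the matrix with entries $d_M(t_\ell - t_j)$ is positive semidefinite, the resulting estimate is a symmetric positive semidefinite matrix, so its eigenvalues can be ranked and thresholded directly.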

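To keep notation simple, from now on we set $t_i := t^n_i$; $0 \le i \le n$. The kernel $\widehat{Q}_T(u,v)$ induces a random linear operator $\widehat{Q}_T$ on the complexification $E_c$ as follows

$$(\widehat{Q}_T f)(u) = \langle \widehat{Q}_T(u,\cdot), f\rangle_{E_c} = \frac{1}{2M+1}\sum_{|s|\le M}\sum_{\ell=0}^{n-1}\sum_{k=0}^{n-1}\exp\big(\mathrm{i}s(t_k - t_\ell)\big)\,\Delta X_{t_{k+1}}(u)\,\langle \Delta X_{t_{\ell+1}}, f\rangle_{E_c},$$

for $f \in E_c$. The reason to consider $E$ over the field $\mathbb{C}$ is the following convenient representation. If $A$ and $B$ are two linear operators on Hilbert spaces, then $AB$ and $BA$ share the same nonzero eigenvalues. Furthermore, if $\gamma$ is an eigenvector of $BA$, then $A\gamma$ is an eigenvector of $AB$ with the same eigenvalue. So the strategy is to write $\widehat{Q}_T = AB$ in such a way that $BA : \mathbb{C}^p \to \mathbb{C}^p$ for some $p \ge 1$, and therefore one can easily relate the eigenvalues of $BA$ to those of $AB$. In fact, by the very definition of $\widehat{Q}_T$ we have

$$\widehat{Q}_T = AB$$

where $B : E_c \to \mathbb{C}^{2M+1}$ is defined by

$$Bf := \Big(\sum_{\ell=0}^{n-1}\exp(-\mathrm{i}(-M)t_\ell)\langle \Delta X_{t_{\ell+1}}, f\rangle_{E_c}, \dots, \sum_{\ell=0}^{n-1}\exp(-\mathrm{i}(M)t_\ell)\langle \Delta X_{t_{\ell+1}}, f\rangle_{E_c}\Big)$$

and $A : \mathbb{C}^{2M+1} \to E_c$ is defined by

(8.1) $(Ax)(\cdot) := \frac{1}{2M+1}\sum_{|s|\le M}\sum_{k=0}^{n-1} x_s \exp(\mathrm{i}st_k)\,\Delta X_{t_{k+1}}(\cdot).$

By the very definition, $\widetilde{Q}_T := BA : \mathbb{C}^{2M+1} \to \mathbb{C}^{2M+1}$ is given componentwise by

$$(\widetilde{Q}_T y)_m = \frac{1}{2M+1}\sum_{|s|\le M}\sum_{k,\ell=0}^{n-1} y_s \exp\big(\mathrm{i}(st_k - mt_\ell)\big)\langle \Delta X_{t_{k+1}}, \Delta X_{t_{\ell+1}}\rangle_{E_c},$$

for $y \in \mathbb{C}^{2M+1}$, $m = -M,\dots,M$. We then arrive at the following elementary result.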

Lemma 8.1. The random linear operators $\widehat{Q}_T$ and $\widetilde{Q}_T$ share the same nonzero eigenvalues in $\mathbb{C}$. Let $p$ be the number of nonzero eigenvalues $\{\theta_i; i = 1,\dots,p\}$ of $\widetilde{Q}_T$ and let $\gamma_j = (\gamma_j(-M),\dots,\gamma_j(M))$, $j = 1,\dots,p$, be the corresponding eigenvectors in $\mathbb{C}^{2M+1}$. Then

$$\frac{1}{2M+1}\sum_{|s|\le M}\gamma_j(s)\Big(\sum_{k=0}^{n-1}\exp(\mathrm{i}st_k)\,\Delta X_{t_{k+1}}\Big), \quad j = 1,\dots,p$$

are the $p$ eigenfunctions of $\widehat{Q}_T$.
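The linear-algebra fact behind Lemma 8.1 can be checked numerically on generic matrices. The sketch below uses random complex matrices as stand-ins for the operators $A$ and $B$ (they are not built from curve data), and the function names are hypothetical. It measures how far the eigenvalues of the small matrix $BA$ are from the spectrum of the large matrix $AB$, and how well each lifted vector $A\gamma_j$ satisfies the eigenvalue equation for $AB$.

```python
import numpy as np


def nonzero_spectrum_gap(A, B):
    """Largest distance from an eigenvalue of B @ A to the spectrum of A @ B."""
    small = np.linalg.eigvals(B @ A)
    big = np.linalg.eigvals(A @ B)
    return max(np.min(np.abs(big - theta)) for theta in small)


def lifted_eigen_residual(A, B):
    """max_j || (A B)(A gamma_j) - theta_j (A gamma_j) || over eigenpairs of B A."""
    theta, gamma = np.linalg.eig(B @ A)
    lifted = A @ gamma                         # columns are A gamma_j
    residual = A @ (B @ lifted) - lifted * theta
    return float(np.max(np.linalg.norm(residual, axis=0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 5)) + 1j * rng.standard_normal((50, 5))
    B = rng.standard_normal((5, 50)) + 1j * rng.standard_normal((5, 50))
    print(nonzero_spectrum_gap(A, B), lifted_eigen_residual(A, B))
```

The practical point is the one made in the text: the eigenproblem can be solved in dimension $2M+1$ and then lifted to the function space through $A$.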

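Proof. Let $\theta_j \in \mathbb{C}$ be a nonzero eigenvalue of $\widetilde{Q}_T$ and let $\gamma_j \in \mathbb{C}^{2M+1}$ be the corresponding eigenvector. Then

(8.2) $\widetilde{Q}_T\gamma_j = \theta_j\gamma_j$ and $\widehat{Q}_T(A\gamma_j) = \theta_j A\gamma_j$ a.s,

where $A$ is the operator given by (8.1). By writing (8.2) component by component we have

$$\frac{1}{2M+1}\sum_{|s|\le M}\sum_{k,\ell=0}^{n-1}\gamma_j(s)\exp\big(\mathrm{i}(st_k - rt_\ell)\big)\langle \Delta X_{t_{k+1}}, \Delta X_{t_{\ell+1}}\rangle_{E_c} = \theta_j\gamma_j(r) \quad a.s$$

for $r = -M,\dots,M$. On the other hand,

$$A\gamma_j = \frac{1}{2M+1}\sum_{|s|\le M}\gamma_j(s)\Big(\sum_{k=0}^{n-1}\exp(\mathrm{i}st_k)\,\Delta X_{t_{k+1}}\Big); \quad j = 1,\dots,p \quad a.s.$$

This concludes the proof of the Lemma. □

Remark 8.1. Let

(8.3) $\frac{1}{2M+1}\sum_{|s|\le M}\gamma_j(s)\Big(\sum_{k=0}^{n-1}\exp(\mathrm{i}st_k)\,\Delta X_{t_{k+1}}\Big); \quad j = 1,\dots,p$ a.s

be the eigenvectors of $\widehat{Q}_T$ related to its nonzero eigenvalues $\{\theta_j; j = 1,\dots,p\}$. Since $\widehat{Q}_T$ is a self-adjoint finite-rank operator, the following spectral decomposition holds a.s

$$\widehat{Q}_T f = \sum_{i=1}^{p}\theta_i\langle f, \psi_i\rangle_{E_c}\psi_i; \quad f \in E_c,$$

where $p \le 2M+1$ a.s for every $n, M$ and $\{\psi_i; i = 1,\dots,p\}$ is an orthonormal set obtained by applying a Gram-Schmidt algorithm to the functions given by (8.3).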

Let us now introduce the basic assumptions on the residual process $\varepsilon$. Since the estimation is based on high-frequency sampling, we need to impose some structure on the continuous-time dynamics.

(B1) The residual process $\varepsilon$ is an Itô semimartingale field whose drift component $h$ satisfies

$$\sup_{0\le t\le T}\|h_t\|_E \in L^p$$

for every $p \ge 1$.

(B2) The quadratic variation of $\varepsilon(u)$ at time $T$ satisfies

$$\int_{\mathbb{R}} [\varepsilon(u), \varepsilon(u)]_T\,\mu(du) = 0 \quad a.s.$$

(B3) The following growth assumption holds

$$0 < \liminf_{n,M\to\infty} M\rho(n) \le \limsup_{n,M\to\infty} M\rho(n) < \infty.$$
(H1) The vector fields $F, \sigma^i : E \to E$ are globally Lipschitz for each $i = 1,\dots,m$.

(H2) Linear growth condition on the vector fields $F, \sigma^i$; $1 \le i \le m$ in (5.1): There exists a constant $C > 0$ such that

$$\|F(x)\|^2_E + \sum_{i=1}^{m}\|\sigma^i(x)\|^2_E \le C(1 + \|x\|^2_E) \quad \text{for every } x \in E.$$

Remark 8.2. As far as the consistency problem of the HJM model is concerned (see Section ??.2), Assumption (B2) means that the initial fitting method used to interpolate the points which generate $X$ cannot introduce an extrinsic volatility. The interpolation must be chosen in such a way that the resulting observed volatility on the whole curve is fully dictated by the market and not by the particular choice of fitting. See also Assumption (Q4). The semimartingale decomposition yields the following structure on the residual process

$$\varepsilon_t = \varepsilon_0 + \int_0^t h_s\,ds$$

for some $\mathcal{F}_0$-measurable random variable $\varepsilon_0 = X_0 - r_0$ and an integrable adapted process $h$ satisfying (B1). Assumption (B3) is a technical assumption in order to get optimal bounds, but it is also linked with the differences between the exact Fourier estimator and the usual quadratic variation estimator for $Q_T$. See Malliavin and Mancino [39] and Clement and Gloter [20] for further details.

Under (H1) and (H2), it is well-known that for every initial condition $f \in E$, there exists a unique mild solution $r^f$ of the stochastic PDE. Moreover, the following integrability property holds

$$\mathbb{E}\sup_{0\le t\le T}\|r^f_t\|^q_E < \infty$$

for every $q \ge 1$ and $f \in E$.

The following result is a functional version (and an almost straightforward consequence) of Proposition 1 and Lemma 3 in [20]. In the sequel, $\|\cdot\|_{(2)}$ is the Hilbert-Schmidt operator norm over $E_c$ and, to keep notation simple, we write $\|\cdot\| = \|\cdot\|_{E_c}$.

Proposition 8.1. Assume that (A1, A2, B1, B2, B3, H1, H2) hold and in addition

(8.4) $\sup_{0\le t\le T}|\partial_u\sigma^j(r_t)(\cdot)|^2 \in L^1(\mu)$

for each $j = 1,\dots,m$. Then

$$\mathbb{E}\|\widehat{Q}_T - Q_T\|^2_{(2)} = O(\rho(n)).$$

Proof. In the sequel, we denote by C a positive constant which may differ from line to line. We also 
decompose 


into 


Xtiu) = niu) -Gitiu), 

pt 

ft(u):=^ / E{rs){u)dBl; 0<t<T, uG[a,b], 
j=i 

St{u) = / ^s{u)ds 

Jo 
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where 

^tiu) := Art{u) + F{rt){u) + ht{u);0 <t<T,uG [a, b]. 
For a given {u, v) S [a, b] x [a, b], integration by parts and (B2) yield 

Qt{u,v) - Qt{u,v) = [ ( [ dM,n{^,t)dXe{u))dXt{v) 


10 ^Jo 
i-T , i-t 


+ 


( J dM,ni^,t)dXi{v)jdXt{u) 


where 


Jo ^Jo 
=: Rn,Miu,v) 




i=l 


i=l 


and 


w) := J dM,n{i,t)dfe{u)'jdrt{v), 

Jn,M,2{u,v) ■.= j dM,n{£,t)dfe{v)'jdrt{u), 

■.= J (^j dM,n{£,t)dse{u)'jdft{v), 

In,M,2{u,v) — J j dM,n{i,t)drt{u)'jdet{v), 

-^n,M,3(w, v) : = I ii: dM,n(i, t)det{u)^det(v), 

where In,M,i are the symmetric quantities w.r.t In,M,i- By the very definition 

WQt - Qt\\%) = IKQt - ( 5r)(0, Oll^ + f duiQr - Qt){u,-) ^i{du) 

J[a,b] 

= |Qr(0,0) - Qr(0,0)p + / |5«(3 t( 0,zi) - 9„(5 t(0, 

J[a,b] 


[a. 6] 


,,b]2 


\duQT{u,0) - duQ t{u, 0)1"^pL{du) 


\dyuQT{u,v) - ^yyQT{u,v)\‘^^l{du)^^{dv) 


=: Ti(n, M) + T 2 {n, M) + T^in, M) + T^in, M). 

Step 1: The term $T_1$. By the invariance hypothesis (A1), we know that [see [48]; Corollary 2.13] $\mathcal{V} \subset \operatorname{dom}(A)$, so we may consider $(\operatorname{dom}(A), A)$ as a bounded operator restricted to $\mathcal{V}$. Moreover, we shall represent


n = PyrTt + Pyr* = Pyr vf + PyC*, 
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where p is the usual projection and h = tq. From Theorem 2.11 in [48], we also know that t i-A 
A{TTY±Vf) is continuous and therefore there exists a constant C such that ||Art|| < C + C'||rt|| for 
every t G [0,T]. Based on these facts, we may use the linear growth conditions (H1-H2) and (Bl) 
to arrive at the following estimate 

(8.5) sup ||Ci|l<C' + C' sup \\rt\\+C sup ||ht||. 

0<t<T 0<t<T 0<t<T 

In this case, one can easily check that assumptions in Proposition 1 of [20] hold trivially and all their 
estimates as well. In this case, we have 

Ti{n,M)<C f f 
Jo Jo 

Lemma 3 in [20] yields Ti{n,M) = 0{p{n)). 

Step 2: The term T 2 + T 3 . Let us now treat T 2 {n,M) + T 3 {n, M). For a given (z,j) G {1,...,to}^, 
Burkholder-Davis-Gundy inequality and (H1-H2) yield 

E / I / f dM,n{£,t)o-^ {re){0)dBjdva\rt){v)dBl\ p{dv) < C f f dii^„{£,t)didt 
J[a,b] ' Jo Jo ' Jq Jo 

This yields u)p^(dw) = 0(p(n)). The same argument also holds for 9«Jra,M,2(0, u) 

and we conclude that 



E|a„ (0,u) + ^vJn,M,2 {0,v)\'^p{dv) 


0{p{n)) 


The drift part is estimated as follows. The Cauchy-Schwarz and Burkholder-Davis-Gundy inequalities, the estimate (8.5), (H1-H2), A1, B1 and Lemma 3 in [20] yield


f E|d„7„,M,i(0,T)|V(dw) < / ( f dlt^„{e,t)d£x f Uefdg 

j[a,b] i=l *^0 *^0 


X \\a{rt)f]dt 


C r f dl,^„{l,t)d£dt = 0{p{n)). 
Jo Jo 


The term $\partial_v I_{n,M,2}(0,v)$ is more involved, but we can repeat the same steps as in the proof of Theorem 1 in [20] on page 1114 to represent


d[0,T]2 


where 


Yn,M{u,t,s) ■■= / dM,n{£,t)dre{u). 

Jo 

We fix ry > 0 and we split \In, m, 2(0, v)\‘^ = An,M,i{u,v,r]) + A„^m,2('«, w, 77) so that 


9«^n.M,i(0, u, ry) := 


dvAn,M,2i^,v,r]) := 


r f 

0 J t — T] 
T 1‘t-r} 

0 ^0 


Yn^M{£>,t,t)dv^tiv)Yn,M{0,t ,t )dy^p {v)dt dt 


Yn,M{0,t,t)dv^t{v)Yn^M{0,t ,t )dy^p {v)dt dt 
By applying the same arguments as in the proof of Theorem 1 in [20] with small $\eta > 0$, together with the Cauchy-Schwarz inequality on $E$ and assumptions (H1-H2), B1, B3, we also have


[a, 6 ] 


E|/„,M, 2 ( 0 ,u)p 77 (du) = 0{p{n)). 


Moreover, (B1, B3) and Lemma 3 in [20] yield


E|/„,M, 3 ( 0 ,u)p/r(d'(;) <C [ [ < Cp{n). 

Jo Jo 


2,b] 


By the symmetry of the other terms, these estimates allow us to conclude that $T_2(n,M) + T_3(n,M) = O(\rho(n))$.

Step 3: The term $T_4$. For given $(i,j) \in \{1,\dots,m\}^2$, the Burkholder-Davis-Gundy and Cauchy-Schwarz inequalities and (H1-H2) with (8.4) yield


E 


f \dl^Jn,M,i{u,v)f p{dv)p{du) = f E f ( f dM,n{£,t)dufT^ {ri){u)dBj'] \\a’'{rt)\f dtp{du) 

J[a,b]‘^ J[a,b] Jo ^Jo 

C f f d\i„{t,t)dldt f sup \dv(y^ {rt){-)fp{du). 

Jo Jo J [a,b] 


< 

Therefore, by the symmetry of the martingale terms and Lemma 3 in [20] we have

/ E\d^^Jn,M,i{u, v) + J„,m, 2 (m, v)\'^ p{dv)p{du) = 0{p{n)). 

Js.\ 

Similarly, Lemma 3 in m and (E3 yield 


rT rt 


^dvuIn,M.i{u,v)\‘^p{du)p{dv) < 


10 Jo 




X E sup ll^tll^ 

0<i<T 


< c[ [ dl,^„ie,t)d£dt = Oipin)). 
Jo Jo 

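Summing up all the inequalities for $T_1$, $T_2$, $T_3$ and $T_4$, we conclude the proof. □

In the sequel, for each $\epsilon > 0$, we define

$$\widehat{p}^{\,\epsilon} := \text{number of non-zero eigenvalues of } \widehat{Q}_T \text{ greater or equal to } \epsilon \quad a.s.$$

Corollary 8.1. Assume that the assumptions in Proposition 8.1 hold and let $\mathcal{Q} = \text{Range } Q_T$ with dimension $p$. Let $\epsilon \to 0$ in such a way that $\epsilon^2\rho(n)^{-1} \to \infty$ as $n \to \infty$. Then, $\mathbb{P}(\widehat{p}^{\,\epsilon} \ne p) = O(\epsilon^{-2}\rho(n))$.

Proof. From Proposition 8.1 we know that $\mathbb{E}\|\widehat{Q}_T - Q_T\|^2_{(2)} = O(\rho(n))$. Since we are considering the ordered eigenvalues $\widehat{\theta}_1 \ge \widehat{\theta}_2 \ge \dots \ge 0$, we have $\{\widehat{p}^{\,\epsilon} > p\} = \{\widehat{\theta}_{p+1} \ge \epsilon\}$. A simple calculation on the Hilbert-Schmidt norm together with $\theta_{p+1} = 0$ a.s yield

$$\widehat{\theta}_{p+1} = |\widehat{\theta}_{p+1} - \theta_{p+1}| \le \|\widehat{Q}_T - Q_T\|_{(2)}.$$

Therefore, by Chebyshev's inequality,

$$\mathbb{P}(\widehat{p}^{\,\epsilon} > p) \le \epsilon^{-2}\,\mathbb{E}\|\widehat{Q}_T - Q_T\|^2_{(2)} = O(\epsilon^{-2}\rho(n)).$$

By noticing that $\{\widehat{p}^{\,\epsilon} < p\} = \{\widehat{\theta}_p < \epsilon\}$ and $\theta_p > 0$ a.s, the same argument gives

$$\mathbb{P}\{\widehat{p}^{\,\epsilon} < p\} \le (\theta_p - \epsilon)^{-2}\,\mathbb{E}\|\widehat{Q}_T - Q_T\|^2_{(2)} = O(\rho(n)).$$

Since $\mathbb{P}(\widehat{p}^{\,\epsilon} \ne p) = \mathbb{P}(\widehat{p}^{\,\epsilon} > p) + \mathbb{P}(\widehat{p}^{\,\epsilon} < p)$, the result follows. □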